TY - GEN
T1 - Drill
T2 - 2017 Conference of the ACM Special Interest Group on Data Communication, SIGCOMM 2017
AU - Ghorbani, Soudeh
AU - Yang, Zibin
AU - Godfrey, P. Brighten
AU - Ganjali, Yashar
AU - Firoozshahian, Amin
N1 - Publisher Copyright:
© 2017 ACM.
PY - 2017/8/7
Y1 - 2017/8/7
N2 - The trend towards simple datacenter network fabric strips most network functionality, including load balancing, out of the network core and pushes it to the edge. This slows reaction to microbursts, the main culprit of packet loss in datacenters.We investigate the opposite direction: could slightly smarter fabric significantly improve load balancing? This paper presents DRILL, a datacenter fabric for Clos networks which performs micro load balancing to distribute load as evenly as possible on microsecond timescales. DRILL employs perpacket decisions at each switch based on local queue occupancies and randomized algorithms to distribute load. Our design addresses the resulting key challenges of packet reordering and topological asymmetry. In simulations with a detailed switch hardware model and realistic workloads, DRILL outperforms recent edge-based load balancers, particularly under heavy load. Under 80% load, for example, it achieves 1.3-1.4× lower mean flow completion time than recent proposals, primarily due to shorter upstream queues. To test hardware feasibility, we implement DRILL in Verilog and estimate its area overhead to be less than 1%. Finally, we analyze DRILL's stability and throughput-efficiency.
AB - The trend towards simple datacenter network fabric strips most network functionality, including load balancing, out of the network core and pushes it to the edge. This slows reaction to microbursts, the main culprit of packet loss in datacenters.We investigate the opposite direction: could slightly smarter fabric significantly improve load balancing? This paper presents DRILL, a datacenter fabric for Clos networks which performs micro load balancing to distribute load as evenly as possible on microsecond timescales. DRILL employs perpacket decisions at each switch based on local queue occupancies and randomized algorithms to distribute load. Our design addresses the resulting key challenges of packet reordering and topological asymmetry. In simulations with a detailed switch hardware model and realistic workloads, DRILL outperforms recent edge-based load balancers, particularly under heavy load. Under 80% load, for example, it achieves 1.3-1.4× lower mean flow completion time than recent proposals, primarily due to shorter upstream queues. To test hardware feasibility, we implement DRILL in Verilog and estimate its area overhead to be less than 1%. Finally, we analyze DRILL's stability and throughput-efficiency.
KW - Clos
KW - Datacenters
KW - Load balancing
KW - Microbursts
KW - Traffic engineering
UR - http://www.scopus.com/inward/record.url?scp=85029447617&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029447617&partnerID=8YFLogxK
U2 - 10.1145/3098822.3098839
DO - 10.1145/3098822.3098839
M3 - Conference contribution
AN - SCOPUS:85029447617
T3 - SIGCOMM 2017 - Proceedings of the 2017 Conference of the ACM Special Interest Group on Data Communication
SP - 225
EP - 238
BT - SIGCOMM 2017 - Proceedings of the 2017 Conference of the ACM Special Interest Group on Data Communication
PB - Association for Computing Machinery
Y2 - 21 August 2017 through 25 August 2017
ER -