Drill: Micro load balancing for low-latency data center networks

Soudeh Ghorbani, Zibin Yang, P. Brighten Godfrey, Yashar Ganjali, Amin Firoozshahian

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The trend towards simple datacenter network fabric strips most network functionality, including load balancing, out of the network core and pushes it to the edge. This slows reaction to microbursts, the main culprit of packet loss in datacenters. We investigate the opposite direction: could slightly smarter fabric significantly improve load balancing? This paper presents DRILL, a datacenter fabric for Clos networks which performs micro load balancing to distribute load as evenly as possible on microsecond timescales. DRILL employs per-packet decisions at each switch based on local queue occupancies and randomized algorithms to distribute load. Our design addresses the resulting key challenges of packet reordering and topological asymmetry. In simulations with a detailed switch hardware model and realistic workloads, DRILL outperforms recent edge-based load balancers, particularly under heavy load. Under 80% load, for example, it achieves 1.3-1.4× lower mean flow completion time than recent proposals, primarily due to shorter upstream queues. To test hardware feasibility, we implement DRILL in Verilog and estimate its area overhead to be less than 1%. Finally, we analyze DRILL's stability and throughput-efficiency.
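
The abstract does not spell out the randomized algorithm, so the following is only a minimal sketch of the general idea of per-packet decisions driven by local queue occupancies. It assumes a "sample a few random queues, also remember the last best one, pick the shortest" rule; the names `pick_port` and `simulate`, the parameter `d`, the single-entry memory, and the absence of queue draining are illustrative assumptions, not details taken from the abstract.

```python
import random


def pick_port(queue_lengths, remembered, d=2, rng=random):
    """Choose an output port for one packet (hypothetical helper).

    Samples up to `d` distinct ports at random, adds any remembered ports,
    and returns the candidate with the shortest local queue. Ties are
    broken by the lower port index.
    """
    n = len(queue_lengths)
    candidates = set(rng.sample(range(n), min(d, n)))
    candidates.update(remembered)
    return min(candidates, key=lambda p: (queue_lengths[p], p))


def simulate(num_ports=8, num_packets=10_000, d=2, seed=0):
    """Toy model: enqueue packets one at a time and never drain queues.

    Returns the final per-port packet counts so the evenness of the
    distribution can be inspected.
    """
    rng = random.Random(seed)
    queues = [0] * num_ports
    remembered = []  # assumption: remember only the last chosen port
    for _ in range(num_packets):
        port = pick_port(queues, remembered, d, rng)
        queues[port] += 1
        remembered = [port]
    return queues


if __name__ == "__main__":
    final = simulate()
    print("per-port counts:", final)
    print("spread (max - min):", max(final) - min(final))
```

Because each decision uses only the switch's own queue lengths, a rule of this shape needs no coordination with the edge or with other switches, which is what lets it react on per-packet timescales. A real switch would also drain queues and would have to handle the packet reordering and topological asymmetry that the abstract mentions; this toy model ignores both.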

Original language: English (US)
Title of host publication: SIGCOMM 2017 - Proceedings of the 2017 Conference of the ACM Special Interest Group on Data Communication
Publisher: Association for Computing Machinery
Pages: 225-238
Number of pages: 14
ISBN (Electronic): 9781450346535
State: Published - Aug 7 2017
Event: 2017 Conference of the ACM Special Interest Group on Data Communication, SIGCOMM 2017 - Los Angeles, United States
Duration: Aug 21 2017 to Aug 25 2017

Publication series

Name: SIGCOMM 2017 - Proceedings of the 2017 Conference of the ACM Special Interest Group on Data Communication

Other

Other: 2017 Conference of the ACM Special Interest Group on Data Communication, SIGCOMM 2017
Country/Territory: United States
City: Los Angeles
Period: 8/21/17 to 8/25/17

Keywords

  • Clos
  • Datacenters
  • Load balancing
  • Microbursts
  • Traffic engineering

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Signal Processing
  • Electrical and Electronic Engineering
  • Communication
