Revisiting network support for RDMA

Radhika Mittal, Alexander Shpiner, Aurojit Panda, Eitan Zahavi, Arvind Krishnamurthy, Sylvia Ratnasamy, Scott Shenker

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The advent of RoCE (RDMA over Converged Ethernet) has led to a significant increase in the use of RDMA in datacenter networks. To achieve good performance, RoCE requires a lossless network which is in turn achieved by enabling Priority Flow Control (PFC) within the network. However, PFC brings with it a host of problems such as head-of-the-line blocking, congestion spreading, and occasional deadlocks. Rather than seek to fix these issues, we instead ask: is PFC fundamentally required to support RDMA over Ethernet? We show that the need for PFC is an artifact of current RoCE NIC designs rather than a fundamental requirement. We propose an improved RoCE NIC (IRN) design that makes a few simple changes to the RoCE NIC for better handling of packet losses. We show that IRN (without PFC) outperforms RoCE (with PFC) by 6-83% for typical network scenarios. Thus not only does IRN eliminate the need for PFC, it improves performance in the process! We further show that the changes that IRN introduces can be implemented with modest overheads of about 3-10% to NIC resources. Based on our results, we argue that research and industry should rethink the current trajectory of network support for RDMA.

Original language: English (US)
Title of host publication: SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
Publisher: Association for Computing Machinery
Pages: 313-326
Number of pages: 14
ISBN (Electronic): 9781450355674
State: Published - Aug 7 2018
Externally published: Yes
Event: 2018 Conference of the ACM Special Interest Group on Data Communication, ACM SIGCOMM 2018 - Budapest, Hungary
Duration: Aug 20 2018 to Aug 25 2018

Keywords

  • Datacenter transport
  • iWARP
  • PFC
  • RDMA
  • RoCE

ASJC Scopus subject areas

  • Communication
  • Electrical and Electronic Engineering
  • Computer Networks and Communications
  • Signal Processing
