TY - GEN
T1 - Revisiting network support for RDMA
AU - Mittal, Radhika
AU - Shpiner, Alexander
AU - Panda, Aurojit
AU - Zahavi, Eitan
AU - Krishnamurthy, Arvind
AU - Ratnasamy, Sylvia
AU - Shenker, Scott
N1 - Publisher Copyright: © 2018 Copyright held by the owner/author(s).
PY - 2018/8/7
Y1 - 2018/8/7
AB - The advent of RoCE (RDMA over Converged Ethernet) has led to a significant increase in the use of RDMA in datacenter networks. To achieve good performance, RoCE requires a lossless network which is in turn achieved by enabling Priority Flow Control (PFC) within the network. However, PFC brings with it a host of problems such as head-of-the-line blocking, congestion spreading, and occasional deadlocks. Rather than seek to fix these issues, we instead ask: is PFC fundamentally required to support RDMA over Ethernet? We show that the need for PFC is an artifact of current RoCE NIC designs rather than a fundamental requirement. We propose an improved RoCE NIC (IRN) design that makes a few simple changes to the RoCE NIC for better handling of packet losses. We show that IRN (without PFC) outperforms RoCE (with PFC) by 6-83% for typical network scenarios. Thus not only does IRN eliminate the need for PFC, it improves performance in the process! We further show that the changes that IRN introduces can be implemented with modest overheads of about 3-10% to NIC resources. Based on our results, we argue that research and industry should rethink the current trajectory of network support for RDMA.
KW - Datacenter transport
KW - iWARP
KW - PFC
KW - RDMA
KW - RoCE
UR - http://www.scopus.com/inward/record.url?scp=85056407588&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85056407588&partnerID=8YFLogxK
U2 - 10.1145/3230543.3230557
DO - 10.1145/3230543.3230557
M3 - Conference contribution
AN - SCOPUS:85056407588
T3 - SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
SP - 313
EP - 326
BT - SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
PB - Association for Computing Machinery
T2 - 2018 Conference of the ACM Special Interest Group on Data Communication, ACM SIGCOMM 2018
Y2 - 20 August 2018 through 25 August 2018
ER -