TY - GEN
T1 - High-throughput and Flexible Host Networking for Accelerated Computing
AU - Skiadopoulos, Athinagoras
AU - Xie, Zhiqiang
AU - Zhao, Mark
AU - Cai, Qizhe
AU - Agarwal, Saksham
AU - Adelmann, Jacob
AU - Ahern, David
AU - Contavalli, Carlo
AU - Goldflam, Michael
AU - Mayatskikh, Vitaly
AU - Raja, Raghu
AU - Walton, Daniel
AU - Agarwal, Rachit
AU - Mukherjee, Shrijeet
AU - Kozyrakis, Christos
N1 - Publisher Copyright: © OSDI 2024. All rights reserved.
PY - 2024
Y1 - 2024
N2 - Modern network hardware is able to meet the stringent bandwidth demands of applications like GPU-accelerated AI. However, existing host network stacks offer a hard tradeoff between performance (in terms of sustained throughput when compared to network hardware capacity) and flexibility (in terms of the ability to select, customize, and extend different network protocols). This paper explores a clean-slate approach to simultaneously offer high performance and flexibility. We present a co-design of the NIC hardware and the software stack to achieve this. The key idea in our design is the physical separation of the data path (payload transfer between network and application buffers) and the control path (header processing and transport-layer decisions). The NIC enables a high-performance zero-copy data path, independent of the placement of the application (CPU, GPU, FPGA, or other accelerators). The software stack provides a flexible control path by enabling the integration of any network protocol, executing in any environment (in the kernel, in user space, or in an accelerator). We implement and evaluate ZeroNIC, a prototype that combines an FPGA-based NIC with a software stack that integrates the Linux TCP protocol. We demonstrate that ZeroNIC achieves RDMA-like throughput while maintaining the benefits of robust protocols like TCP under various network perturbations. For instance, ZeroNIC enables a single TCP flow to saturate a 100Gbps link while utilizing only 17% of a single CPU core. ZeroNIC improves NCCL and Redis throughput by 2.66× and 3.71×, respectively, over Linux TCP on a Mellanox ConnectX-6 NIC, without requiring application modifications.
UR - http://www.scopus.com/inward/record.url?scp=85201271300&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85201271300&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85201271300
T3 - Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2024
SP - 405
EP - 423
BT - Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2024
PB - USENIX Association
T2 - 18th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2024
Y2 - 10 July 2024 through 12 July 2024
ER -