TY - GEN
T1 - NetDIMM
T2 - 52nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2019
AU - Alian, Mohammad
AU - Kim, Nam Sung
N1 - This work was in part supported by an NSF grant (CNS-1705047).
PY - 2019/10/12
Y1 - 2019/10/12
N2 - Optimizing bandwidth was the main focus in designing scale-out networks for several decades, and this trend has served traditional Internet applications well. However, the emergence of datacenters as single computer entities has made latency as important as bandwidth in designing datacenter networks. The PCIe interconnect is known to be a latency bottleneck in communication networks, as its overhead can contribute up to ~90% of the overall communication latency. Despite its overheads, PCIe is the de facto interconnect standard in servers, as it has been well established and maintained for more than two decades. In addition to the PCIe overhead, data movements in the network software stack consume thousands of processor cycles and make ultra-low-latency networking more challenging. To tackle the PCIe and data movement overheads, we architect NetDIMM, a near-memory network interface card capable of in-memory buffer cloning. NetDIMM places a network interface card chip into the buffer device of a dual in-line memory module and leverages the asynchronous memory access capability of DDR5 to share the memory modules between the host processor and the near-memory NIC. Our evaluation shows that NetDIMM, on average, improves per-packet latency by 49.9% compared with a baseline network deploying PCIe NICs.
AB - Optimizing bandwidth was the main focus in designing scale-out networks for several decades, and this trend has served traditional Internet applications well. However, the emergence of datacenters as single computer entities has made latency as important as bandwidth in designing datacenter networks. The PCIe interconnect is known to be a latency bottleneck in communication networks, as its overhead can contribute up to ~90% of the overall communication latency. Despite its overheads, PCIe is the de facto interconnect standard in servers, as it has been well established and maintained for more than two decades. In addition to the PCIe overhead, data movements in the network software stack consume thousands of processor cycles and make ultra-low-latency networking more challenging. To tackle the PCIe and data movement overheads, we architect NetDIMM, a near-memory network interface card capable of in-memory buffer cloning. NetDIMM places a network interface card chip into the buffer device of a dual in-line memory module and leverages the asynchronous memory access capability of DDR5 to share the memory modules between the host processor and the near-memory NIC. Our evaluation shows that NetDIMM, on average, improves per-packet latency by 49.9% compared with a baseline network deploying PCIe NICs.
KW - Near-memory computing
KW - Network architecture
UR - https://www.scopus.com/pages/publications/85074448756
UR - https://www.scopus.com/pages/publications/85074448756#tab=citedBy
U2 - 10.1145/3352460.3358278
DO - 10.1145/3352460.3358278
M3 - Conference contribution
AN - SCOPUS:85074448756
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 699
EP - 711
BT - MICRO 2019 - 52nd Annual IEEE/ACM International Symposium on Microarchitecture, Proceedings
PB - IEEE Computer Society
Y2 - 12 October 2019 through 16 October 2019
ER -