TY - GEN
T1 - System Virtualization for Neural Processing Units
AU - Xue, Yuqi
AU - Liu, Yiqi
AU - Huang, Jian
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/6/22
Y1 - 2023/6/22
N2 - Modern cloud platforms have been employing hardware accelerators such as neural processing units (NPUs) to meet the increasing demand for computing resources for AI-based application services. However, due to the lack of system virtualization support, the current way of using NPUs in cloud platforms suffers from either low resource utilization or poor isolation between multi-tenant application services. In this paper, we investigate the system virtualization techniques for NPUs across the entire software and hardware stack, and present our NPU virtualization solution named NeuCloud. We propose a flexible NPU abstraction named vNPU that allows fine-grained NPU virtualization and resource management. We leverage this abstraction and design the vNPU allocation, mapping, and scheduling policies to maximize the resource utilization, while achieving both performance and security isolation for vNPU instances at runtime.
KW - accelerator virtualization
KW - cloud computing
KW - hardware accelerator
KW - neural processing unit
UR - http://www.scopus.com/inward/record.url?scp=85166206053&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85166206053&partnerID=8YFLogxK
U2 - 10.1145/3593856.3595912
DO - 10.1145/3593856.3595912
M3 - Conference contribution
AN - SCOPUS:85166206053
T3 - HotOS 2023 - Proceedings of the 19th Workshop on Hot Topics in Operating Systems
SP - 80
EP - 86
BT - HotOS 2023 - Proceedings of the 19th Workshop on Hot Topics in Operating Systems
PB - Association for Computing Machinery
T2 - 19th Workshop on Hot Topics in Operating Systems, HotOS 2023
Y2 - 22 June 2023 through 24 June 2023
ER -