System Virtualization for Neural Processing Units

Yuqi Xue, Yiqi Liu, Jian Huang

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Modern cloud platforms have been employing hardware accelerators such as neural processing units (NPUs) to meet the increasing demand for computing resources for AI-based application services. However, due to the lack of system virtualization support, the current way of using NPUs in cloud platforms suffers from either low resource utilization or poor isolation between multi-tenant application services. In this paper, we investigate the system virtualization techniques for NPUs across the entire software and hardware stack, and present our NPU virtualization solution named NeuCloud. We propose a flexible NPU abstraction named vNPU that allows fine-grained NPU virtualization and resource management. We leverage this abstraction and design the vNPU allocation, mapping, and scheduling policies to maximize the resource utilization, while achieving both performance and security isolation for vNPU instances at runtime.

Original language: English (US)
Title of host publication: HotOS 2023 - Proceedings of the 19th Workshop on Hot Topics in Operating Systems
Publisher: Association for Computing Machinery
Pages: 80-86
Number of pages: 7
ISBN (Electronic): 9798400701955
State: Published - Jun 22 2023
Event: 19th Workshop on Hot Topics in Operating Systems, HotOS 2023 - Providence, United States
Duration: Jun 22 2023 to Jun 24 2023

Publication series

Name: HotOS 2023 - Proceedings of the 19th Workshop on Hot Topics in Operating Systems

Conference

Conference: 19th Workshop on Hot Topics in Operating Systems, HotOS 2023
Country/Territory: United States
City: Providence
Period: 6/22/23 to 6/24/23

Keywords

  • accelerator virtualization
  • cloud computing
  • hardware accelerator
  • neural processing unit

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
