TY - GEN
T1 - S2TAR
T2 - 17th IEEE International Conference on Cloud Computing, CLOUD 2024
AU - Ren, Wei
AU - Koteshwara, Sandhya
AU - Ye, Mengmei
AU - Franke, Hubertus
AU - Chen, Deming
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - The demand for hardware accelerators such as Tensor Processing Units (TPUs) and Graphics Processing Units (GPUs) is rapidly increasing due to growing Machine Learning (ML) workloads. As with any shared computing resource, there is a growing need to dynamically adjust and scale accelerator services while ensuring data privacy and confidentiality, especially in cloud environments. We propose a secure and reconfigurable TPU design with confidential computing support, achieved through a Trusted Execution Environment (TEE) framework tailored for reconfigurable TPUs in a multi-tenant cloud. Our contributions include a novel TPU design based on switchbox-enabled systolic arrays to support rapid dynamic partitioning. We evaluate our TPU design with TEEs in shared environments, achieving up to 42.1% higher performance for realistic ML inference workloads. Our remote attestation protocol extends to sub-device partitions, providing trustworthiness at a fine-grained level, and decouples host and accelerator TEEs into separate attestation reports without degrading security guarantees. Our work presents a new TEE framework for secure and reconfigurable ML accelerators in a multi-tenant cloud environment.
AB - The demand for hardware accelerators such as Tensor Processing Units (TPUs) and Graphics Processing Units (GPUs) is rapidly increasing due to growing Machine Learning (ML) workloads. As with any shared computing resource, there is a growing need to dynamically adjust and scale accelerator services while ensuring data privacy and confidentiality, especially in cloud environments. We propose a secure and reconfigurable TPU design with confidential computing support, achieved through a Trusted Execution Environment (TEE) framework tailored for reconfigurable TPUs in a multi-tenant cloud. Our contributions include a novel TPU design based on switchbox-enabled systolic arrays to support rapid dynamic partitioning. We evaluate our TPU design with TEEs in shared environments, achieving up to 42.1% higher performance for realistic ML inference workloads. Our remote attestation protocol extends to sub-device partitions, providing trustworthiness at a fine-grained level, and decouples host and accelerator TEEs into separate attestation reports without degrading security guarantees. Our work presents a new TEE framework for secure and reconfigurable ML accelerators in a multi-tenant cloud environment.
KW - Cloud Computing
KW - Confidential Computing
KW - Dynamic Partitioning
KW - Hardware Accelerators
KW - Tensor Processing Units
KW - Trusted Execution Environment
UR - http://www.scopus.com/inward/record.url?scp=85203250949&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85203250949&partnerID=8YFLogxK
U2 - 10.1109/CLOUD62652.2024.00038
DO - 10.1109/CLOUD62652.2024.00038
M3 - Conference contribution
AN - SCOPUS:85203250949
T3 - IEEE International Conference on Cloud Computing, CLOUD
SP - 267
EP - 278
BT - Proceedings - 2024 IEEE 17th International Conference on Cloud Computing, CLOUD 2024
A2 - Chang, Rong N.
A2 - Chang, Carl K.
A2 - Yang, Jingwei
A2 - Atukorala, Nimanthi
A2 - Jin, Zhi
A2 - Sheng, Michael
A2 - Fan, Jing
A2 - Fletcher, Kenneth
A2 - He, Qiang
A2 - Kosar, Tevfik
A2 - Sarkar, Santonu
A2 - Venkateswaran, Sreekrishnan
A2 - Wang, Shangguang
A2 - Liu, Xuanzhe
A2 - Seelam, Seetharami
A2 - Narayanaswami, Chandra
A2 - Zong, Ziliang
PB - IEEE Computer Society
Y2 - 7 July 2024 through 13 July 2024
ER -