TY - GEN
T1 - GPU-Accelerated Error-Bounded Compression Framework for Quantum Circuit Simulations
AU - Shah, Milan
AU - Yu, Xiaodong
AU - Di, Sheng
AU - Lykov, Danylo
AU - Alexeev, Yuri
AU - Becchi, Michela
AU - Cappello, Franck
N1 - This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC, a collaboratvie effort of two DOE organizations – the Office of Science and the National Nuclear Security Administration, responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed platforms, to support the nation’s exascale computing imperative. The material was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (ASCR), under contract DE-AC02-06CH11357,and supported by the National Science Foundation under Grant OAC-2003709 and OAC-2104023. We acknowledgethe computing resources provided on ThetaGPU and JLSE (operated by Argonne Leadership Computing Facility). At NC State, the project was supported through subcontract DE-AC02-06CH11357and National Science Foundation’s award CNS-1812727.
PY - 2023
Y1 - 2023
N2 - Quantum circuit simulations enable researchers to develop quantum algorithms without the need for a physical quantum computer. Quantum computing simulators, however, all suffer from significant memory footprint requirements, which prevents large circuits from being simulated on classical super-computers. In this paper, we explore different lossy compression strategies to substantially shrink quantum circuit tensors in the QTensor package (a state-of-the-art tensor network quantum circuit simulator) while ensuring the reconstructed data satisfy the user-needed fidelity.Our contribution is fourfold. (1) We propose a series of optimized pre- and post-processing steps to boost the compression ratio of tensors with a very limited performance overhead. (2) We characterize the impact of lossy decompressed data on quantum circuit simulation results, and leverage the analysis to ensure the fidelity of reconstructed data. (3) We propose a configurable compression framework for GPU based on cuSZ and cuSZx, two state-of-the-art GPU-accelerated lossy compressors, to address different use-cases: either prioritizing compression ratios or prioritizing compression speed. (4) We perform a comprehensive evaluation by running 9 state-of-the-art compressors on an NVIDIA A100 GPU based on QTensor-generated tensors of varying sizes. When prioritizing compression ratio, our results show that our strategies can increase the compression ratio nearly 10 times compared to using only cuSZ. When prioritizing throughput, we can perform compression at the comparable speed as cuSZx while achieving 3-4× higher compression ratios. Decompressed tensors can be used in QTensor circuit simulation to yield a final energy result within 1-5% of the true energy value.
AB - Quantum circuit simulations enable researchers to develop quantum algorithms without the need for a physical quantum computer. Quantum computing simulators, however, all suffer from significant memory footprint requirements, which prevents large circuits from being simulated on classical super-computers. In this paper, we explore different lossy compression strategies to substantially shrink quantum circuit tensors in the QTensor package (a state-of-the-art tensor network quantum circuit simulator) while ensuring the reconstructed data satisfy the user-needed fidelity.Our contribution is fourfold. (1) We propose a series of optimized pre- and post-processing steps to boost the compression ratio of tensors with a very limited performance overhead. (2) We characterize the impact of lossy decompressed data on quantum circuit simulation results, and leverage the analysis to ensure the fidelity of reconstructed data. (3) We propose a configurable compression framework for GPU based on cuSZ and cuSZx, two state-of-the-art GPU-accelerated lossy compressors, to address different use-cases: either prioritizing compression ratios or prioritizing compression speed. (4) We perform a comprehensive evaluation by running 9 state-of-the-art compressors on an NVIDIA A100 GPU based on QTensor-generated tensors of varying sizes. When prioritizing compression ratio, our results show that our strategies can increase the compression ratio nearly 10 times compared to using only cuSZ. When prioritizing throughput, we can perform compression at the comparable speed as cuSZx while achieving 3-4× higher compression ratios. Decompressed tensors can be used in QTensor circuit simulation to yield a final energy result within 1-5% of the true energy value.
KW - GPU
KW - compression
KW - quantum computing
UR - http://www.scopus.com/inward/record.url?scp=85166622358&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85166622358&partnerID=8YFLogxK
U2 - 10.1109/IPDPS54959.2023.00081
DO - 10.1109/IPDPS54959.2023.00081
M3 - Conference contribution
AN - SCOPUS:85166622358
T3 - Proceedings - 2023 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023
SP - 757
EP - 767
BT - Proceedings - 2023 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 37th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023
Y2 - 15 May 2023 through 19 May 2023
ER -