TY - GEN
T1 - Distributed-memory DMRG via sparse and dense parallel tensor contractions
AU - Levy, Ryan
AU - Solomonik, Edgar
AU - Clark, Bryan K.
N1 - We thank Xiongjie Yu, who was the primary developer (under the guidance of BKC) of the serial version of the tensor-tools code on which our parallel version was built. We acknowledge both the use of ITensor to compare against our tensor-tools code and the use/modification of their AutoMPO and lattice code directly to facilitate comparison. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. We used XSEDE to employ Stampede2 at the Texas Advanced Computing Center (TACC) through allocation TG-CCR180006. This research is also part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993) and the state of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications. BKC was supported by DOE grant DE-SC0020165. ES was supported by NSF grant no. 1839204.
PY - 2020/11
Y1 - 2020/11
N2 - The density matrix renormalization group (DMRG) algorithm is a powerful tool for solving eigenvalue problems to model quantum systems. DMRG relies on tensor contractions and dense linear algebra to compute properties of condensed matter physics systems. However, its efficient parallel implementation is challenging due to limited concurrency, large memory footprint, and tensor sparsity. We mitigate these problems by implementing two new parallel approaches that handle block sparsity arising in DMRG, via Cyclops, a distributed-memory tensor contraction library. We benchmark their performance on two physical systems using the Blue Waters and Stampede2 supercomputers. Our DMRG performance is improved by up to 5.9X in runtime and 99X in processing rate over ITensor, at roughly comparable computational resource use. This enables higher-accuracy calculations via larger tensors for quantum state approximation. We demonstrate that despite having limited concurrency, DMRG is weakly scalable with the use of efficient parallel tensor contraction mechanisms.
KW - Cyclops Tensor Framework
KW - DMRG
KW - quantum systems
KW - sparse tensors
KW - tensor contractions
KW - tensor networks
UR - http://www.scopus.com/inward/record.url?scp=85102355456&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102355456&partnerID=8YFLogxK
U2 - 10.1109/SC41405.2020.00028
DO - 10.1109/SC41405.2020.00028
M3 - Conference contribution
AN - SCOPUS:85102355456
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2020
PB - IEEE Computer Society
T2 - 2020 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2020
Y2 - 9 November 2020 through 19 November 2020
ER -