TY - GEN

T1 - A communication-avoiding parallel algorithm for the symmetric eigenvalue problem

AU - Solomonik, Edgar

AU - Ballard, Grey

AU - Demmel, James

AU - Hoefler, Torsten

N1 - Publisher Copyright:
© 2017 Copyright held by the owner/author(s).

PY - 2017/7/24

Y1 - 2017/7/24

N2 - Many large-scale scientific computations require eigenvalue solvers in a scaling regime where efficiency is limited by data movement. We introduce a parallel algorithm for computing the eigenvalues of a dense symmetric matrix, which performs asymptotically less communication than previously known approaches. We provide analysis in the Bulk Synchronous Parallel (BSP) model with additional consideration for communication between a local memory and cache. Given sufficient memory to store c copies of the symmetric matrix, our algorithm requires Θ( √c) less interprocessor communication than previously known algorithms, for any c ≤ p1/3 when using p processors. The algorithm first reduces the dense symmetric matrix to a banded matrix with the same eigenvalues. Subsequently, the algorithm employs successive reduction to O(log p) thinner banded matrices.We employ two newparallel algorithms that achieve lower communication costs for the full-to-band and band-to-band reductions. Both of these algorithms leverage a novel QR factorization algorithm for rectangular matrices.

AB - Many large-scale scientific computations require eigenvalue solvers in a scaling regime where efficiency is limited by data movement. We introduce a parallel algorithm for computing the eigenvalues of a dense symmetric matrix, which performs asymptotically less communication than previously known approaches. We provide analysis in the Bulk Synchronous Parallel (BSP) model with additional consideration for communication between a local memory and cache. Given sufficient memory to store c copies of the symmetric matrix, our algorithm requires Θ( √c) less interprocessor communication than previously known algorithms, for any c ≤ p1/3 when using p processors. The algorithm first reduces the dense symmetric matrix to a banded matrix with the same eigenvalues. Subsequently, the algorithm employs successive reduction to O(log p) thinner banded matrices.We employ two newparallel algorithms that achieve lower communication costs for the full-to-band and band-to-band reductions. Both of these algorithms leverage a novel QR factorization algorithm for rectangular matrices.

UR - http://www.scopus.com/inward/record.url?scp=85027876294&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85027876294&partnerID=8YFLogxK

U2 - 10.1145/3087556.3087561

DO - 10.1145/3087556.3087561

M3 - Conference contribution

AN - SCOPUS:85027876294

T3 - Annual ACM Symposium on Parallelism in Algorithms and Architectures

SP - 111

EP - 121

BT - SPAA 2017 - Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures

PB - Association for Computing Machinery

T2 - 29th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2017

Y2 - 24 July 2017 through 26 July 2017

ER -