Tradeoffs between synchronization, communication, and computation in parallel linear algebra computations

Edgar Solomonik, Erin Carson, Nicholas Knight, James Demmel

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This paper derives tradeoffs between three basic costs of a parallel algorithm: synchronization, data movement, and computational cost. These tradeoffs are lower bounds on the execution time of the algorithm which are independent of the number of processors, but dependent on the problem size. Therefore, they provide lower bounds on the parallel execution time of any algorithm computed by a system composed of any number of homogeneous components, each with associated computational, communication, and synchronization payloads. We employ a theoretical model counts the amount of work and data movement as a maximum of any execution path during the parallel computation. By considering this metric, rather than the total communication volume over the whole machine, we obtain new insights into the characteristics of parallel schedules for algorithms with non-trivial dependency structures. We also present reductions from BSP and LogP algorithms to our execution model, extending our lower bounds to these two models of parallel computation. We first develop our results for general dependency graphs and hypergraphs based on their expansion properties, then we apply the theorem to a number of specific algorithms in numerical linear algebra, namely triangular substitution, Gaussian elimination, and Krylov subspace methods. Our lower bound for LU factorization demonstrates the optimality of Tiskin's LU algorithm [26] answering an open question posed in his paper, as well as of the 2.5D LU [21] algorithm which has analogous costs. We treat the computations in a general manner by noting that the computations share a similar dependency hypergraph structure and and analyzing the communication requirements of lattice hypergraph structures.

Original languageEnglish (US)
Title of host publicationSPAA 2014 - Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures
PublisherAssociation for Computing Machinery
Pages307-318
Number of pages12
ISBN (Print)9781450328210
DOIs
StatePublished - 2014
Externally publishedYes
Event26th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2014 - Prague, Czech Republic
Duration: Jun 23 2014Jun 25 2014

Publication series

NameAnnual ACM Symposium on Parallelism in Algorithms and Architectures

Other

Other26th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2014
Country/TerritoryCzech Republic
CityPrague
Period6/23/146/25/14

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'Tradeoffs between synchronization, communication, and computation in parallel linear algebra computations'. Together they form a unique fingerprint.

Cite this