On designing optimal parallel triangular solvers

Research output: Contribution to journalArticlepeer-review


This paper explores the problem of solving triangular linear systems on parallel distributed-memory machines. Working within the LogP model, tight asymptotic bounds for solving these systems using forward/backward substitution are presented. Specifically, lower bounds on execution time independent of the data layout, lower bounds for data layouts in which the number of data items per processor is bounded, and lower bounds for specific data layouts commonly used in designing parallel algorithms for this problem are presented in this paper. Furthermore, algorithms are provided which have running times within a constant factor of the lower bounds described. One interesting result is that the popular two-dimensional block matrix layout necessarily results in significantly longer running times than simpler one-dimensional schemes. Finally, a generalization of the lower bounds to banded triangular linear systems is presented.

Original languageEnglish (US)
Pages (from-to)172-210
Number of pages39
JournalInformation and Computation
Issue number2
StatePublished - Jan 1 2000
Externally publishedYes


  • Distributed-memory
  • LogP model
  • Matrix computation
  • Numerical methods
  • Parallel algorithms and complexity
  • Triangular solvers

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics


Dive into the research topics of 'On designing optimal parallel triangular solvers'. Together they form a unique fingerprint.

Cite this