### Abstract

We propose a model, LPRAM, for parallel random access machines with local memory that captures both the communication and computational requirements in parallel computation. For this model, we present several interesting results, including the following:. Two n × n matrices can be multiplied in 0(n^{3}/p) computation time and 0(n^{2}/p^{ 2 3}) communication steps using p processors (for p = 0(n^{3}/log^{ 3 2} n)). Furthermore, these bounds are optimal for arithmetic on semirings (using +, × only). It is shown that any algorithm that uses comparisons only and that sorts n words requires Ω(n log n/(p log(n/p))) communication steps for 1 < p < n. We also provide an algorithm that sorts n words and uses (-)(n log n/p) computation time and (-)(n log n/p log(n/p))) communication steps. These bounds also apply for computing an n-point FFT graph. It is shown that computing any binary tree τ with n nodes and height h requires Ω(n/p + log n + √h) communication steps, and can always be computed in 0(n/p + min(√n, h)) steps. We also present a simple linear-time algorithm that generates a schedule for computing τ in at most 2D_{opt}(τ) steps, where D_{opt}(τ) represents the minimum communication delay for computing τ. It is also shown that various problems that are expressed as DAGs exhibit a communication-delay/computation-time trade-off.

Original language | English (US) |
---|---|

Pages (from-to) | 3-28 |

Number of pages | 26 |

Journal | Theoretical Computer Science |

Volume | 71 |

Issue number | 1 |

DOIs | |

State | Published - Mar 13 1990 |

Externally published | Yes |

### Fingerprint

### ASJC Scopus subject areas

- Theoretical Computer Science
- Computer Science(all)

### Cite this

*Theoretical Computer Science*,

*71*(1), 3-28. https://doi.org/10.1016/0304-3975(90)90188-N

**Communication complexity of PRAMs.** / Aggarwal, Alok; Chandra, Ashok K.; Snir, Marc.

Research output: Contribution to journal › Article

*Theoretical Computer Science*, vol. 71, no. 1, pp. 3-28. https://doi.org/10.1016/0304-3975(90)90188-N

}

TY - JOUR

T1 - Communication complexity of PRAMs

AU - Aggarwal, Alok

AU - Chandra, Ashok K.

AU - Snir, Marc

PY - 1990/3/13

Y1 - 1990/3/13

N2 - We propose a model, LPRAM, for parallel random access machines with local memory that captures both the communication and computational requirements in parallel computation. For this model, we present several interesting results, including the following:. Two n × n matrices can be multiplied in 0(n3/p) computation time and 0(n2/p 2 3) communication steps using p processors (for p = 0(n3/log 3 2 n)). Furthermore, these bounds are optimal for arithmetic on semirings (using +, × only). It is shown that any algorithm that uses comparisons only and that sorts n words requires Ω(n log n/(p log(n/p))) communication steps for 1 < p < n. We also provide an algorithm that sorts n words and uses (-)(n log n/p) computation time and (-)(n log n/p log(n/p))) communication steps. These bounds also apply for computing an n-point FFT graph. It is shown that computing any binary tree τ with n nodes and height h requires Ω(n/p + log n + √h) communication steps, and can always be computed in 0(n/p + min(√n, h)) steps. We also present a simple linear-time algorithm that generates a schedule for computing τ in at most 2Dopt(τ) steps, where Dopt(τ) represents the minimum communication delay for computing τ. It is also shown that various problems that are expressed as DAGs exhibit a communication-delay/computation-time trade-off.

AB - We propose a model, LPRAM, for parallel random access machines with local memory that captures both the communication and computational requirements in parallel computation. For this model, we present several interesting results, including the following:. Two n × n matrices can be multiplied in 0(n3/p) computation time and 0(n2/p 2 3) communication steps using p processors (for p = 0(n3/log 3 2 n)). Furthermore, these bounds are optimal for arithmetic on semirings (using +, × only). It is shown that any algorithm that uses comparisons only and that sorts n words requires Ω(n log n/(p log(n/p))) communication steps for 1 < p < n. We also provide an algorithm that sorts n words and uses (-)(n log n/p) computation time and (-)(n log n/p log(n/p))) communication steps. These bounds also apply for computing an n-point FFT graph. It is shown that computing any binary tree τ with n nodes and height h requires Ω(n/p + log n + √h) communication steps, and can always be computed in 0(n/p + min(√n, h)) steps. We also present a simple linear-time algorithm that generates a schedule for computing τ in at most 2Dopt(τ) steps, where Dopt(τ) represents the minimum communication delay for computing τ. It is also shown that various problems that are expressed as DAGs exhibit a communication-delay/computation-time trade-off.

UR - http://www.scopus.com/inward/record.url?scp=0025231126&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0025231126&partnerID=8YFLogxK

U2 - 10.1016/0304-3975(90)90188-N

DO - 10.1016/0304-3975(90)90188-N

M3 - Article

AN - SCOPUS:0025231126

VL - 71

SP - 3

EP - 28

JO - Theoretical Computer Science

JF - Theoretical Computer Science

SN - 0304-3975

IS - 1

ER -