## Abstract

We propose a model, LPRAM, for parallel random access machines with local memory that captures both the communication and computational requirements in parallel computation. For this model, we present several interesting results, including the following:. Two n × n matrices can be multiplied in 0(n^{3}/p) computation time and 0(n^{2}/p^{ 2 3}) communication steps using p processors (for p = 0(n^{3}/log^{ 3 2} n)). Furthermore, these bounds are optimal for arithmetic on semirings (using +, × only). It is shown that any algorithm that uses comparisons only and that sorts n words requires Ω(n log n/(p log(n/p))) communication steps for 1 < p < n. We also provide an algorithm that sorts n words and uses (-)(n log n/p) computation time and (-)(n log n/p log(n/p))) communication steps. These bounds also apply for computing an n-point FFT graph. It is shown that computing any binary tree τ with n nodes and height h requires Ω(n/p + log n + √h) communication steps, and can always be computed in 0(n/p + min(√n, h)) steps. We also present a simple linear-time algorithm that generates a schedule for computing τ in at most 2D_{opt}(τ) steps, where D_{opt}(τ) represents the minimum communication delay for computing τ. It is also shown that various problems that are expressed as DAGs exhibit a communication-delay/computation-time trade-off.

Original language | English (US) |
---|---|

Pages (from-to) | 3-28 |

Number of pages | 26 |

Journal | Theoretical Computer Science |

Volume | 71 |

Issue number | 1 |

DOIs | |

State | Published - Mar 13 1990 |

Externally published | Yes |

## ASJC Scopus subject areas

- Theoretical Computer Science
- General Computer Science