## Abstract

We consider the problem of communication avoidance in computing interactions between a set of particles in scenarios with and without a cutoff radius for interaction. Our strategy, which we show to be optimal in communication, divides the work in the iteration space rather than simply dividing the particles over processors, so more than one processor may be responsible for computing updates to a single particle. Similar to a force decomposition in molecular dynamics, this approach requires up to √p times more memory than a particle decomposition, but reduces communication costs by factors up to √p and is often faster in practice than a particle decomposition [1]. We examine a generalized force decomposition algorithm that tolerates the memory limited case, i.e. when memory can only hold c copies of the particles for c = 1, 2,...,√p. When c = 1, the algorithm degenerates into a particle decomposition, similarly when c = √p, the algorithm uses a force decomposition. We present a proof that the algorithm is communication-optimal and reduces critical path latency and bandwidth costs by factors of c^{2} and c, respectively. Performance results from experiments on up to 24K cores of Cray XE-6 and 32K cores of IBM Blue Gene/P machines indicate that the algorithm reduces communication in practice. In some cases, it even outperforms the original force decomposition approach because the right choice of c strikes a balance between the costs of collective and point-to-point communication. Finally, we extend the analysis to include a cutoff radius for direct evaluation of force interactions. We show that with a cutoff, communication optimality still holds. We sketch a generalized algorithm for multi-dimensional space and assess its performance for 1D and 2D simulations on the same systems.

Original language | English (US) |
---|---|

Pages | 1075-1084 |

Number of pages | 10 |

DOIs | |

State | Published - 2013 |

Externally published | Yes |

Event | 27th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2013 - Boston, MA, United States Duration: May 20 2013 → May 24 2013 |

### Other

Other | 27th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2013 |
---|---|

Country/Territory | United States |

City | Boston, MA |

Period | 5/20/13 → 5/24/13 |

## Keywords

- communication-avoiding algorithms
- parallel algorithms
- particle methods

## ASJC Scopus subject areas

- Software