Improving communication performance in dense linear algebra via topology aware collectives

Edgar Solomonik, Abhinav Bhatele, James Demmel

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Recent results have shown that topology aware mapping reduces network contention in communication-intensive kernels on massively parallel machines. We demonstrate that on mesh interconnects, topology aware mapping also allows for the utilization of highly-efficient topology aware collectives. We map novel 2.5D dense linear algebra algorithms to exploit rectangular collectives on cuboid partitions allocated by a Blue Gene/P supercomputer. Our mappings allow the algorithms to exploit optimized line multicasts and reductions. Commonly used 2D algorithms cannot be mapped in this fashion. On 16,384 nodes (65,536 cores) of Blue Gene/P, 2.5D algorithms that exploit rectangular collectives are sig- nificantly faster than 2D matrix multiplication (MM) and LU factorization, up to 8.7x and 2.1x, respectively. These speed-ups are due to communication reduction (up to 95.6% for 2.5D MM with respect to 2D MM). We also derive LogP- based novel performance models for rectangular broadcasts and reductions. Using those, we model the performance of matrix multiplication and LU factorization on a hypothetical exascale architecture.

Original languageEnglish (US)
Title of host publicationProceedings of 2011 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
DOIs
StatePublished - 2011
Externally publishedYes
Event2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC11 - Seattle, WA, United States
Duration: Nov 12 2011Nov 18 2011

Publication series

NameProceedings of 2011 SC - International Conference for High Performance Computing, Networking, Storage and Analysis

Other

Other2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC11
Country/TerritoryUnited States
CitySeattle, WA
Period11/12/1111/18/11

Keywords

  • Communication
  • Exascale
  • Interconnect topology
  • Mapping
  • Performance

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

Fingerprint

Dive into the research topics of 'Improving communication performance in dense linear algebra via topology aware collectives'. Together they form a unique fingerprint.

Cite this