Parallelization of Reordering Algorithms for Bandwidth and Wavefront Reduction

Konstantinos I. Karantasis, Andrew Lenharth, Donald Nguyen, María J. Garzarán, Keshav Pingali

Research output: Contribution to journalConference articlepeer-review

Abstract

Many sparse matrix computations can be speeded up if the matrix is first reordered. Reordering was originally developed for direct methods but it has recently become popular for improving the cache locality of parallel iterative solvers since reordering the matrix to reduce bandwidth and wave front can improve the locality of reference of sparse matrix-vector multiplication (SpMV), the key kernel in iterative solvers. In this paper, we present the first parallel implementations of two widely used reordering algorithms: Reverse Cut hill-McKee (RCM) and Sloan. On 16 cores of the Stampede supercomputer, our parallel RCM is 5.56 times faster on the average than a state-of-the-art sequential implementation of RCM in the HSL library. Sloan is significantly more constrained than RCM, but our parallel implementation achieves a speedup of 2.88X on the average over sequential HSL-Sloan. Reordering the matrix using our parallel RCM and then performing 100 SpMV iterations is twice as fast as using HSL-RCM and then performing the SpMV iterations, it is also 1.5 times faster than performing the SpMV iterations without reordering the matrix.

Original languageEnglish (US)
Article number7013062
Pages (from-to)921-932
Number of pages12
JournalInternational Conference for High Performance Computing, Networking, Storage and Analysis, SC
Volume2015-January
Issue numberJanuary
DOIs
StatePublished - Jan 16 2014
EventInternational Conference for High Performance Computing, Networking, Storage and Analysis, SC 2014 - New Orleans, United States
Duration: Nov 16 2014Nov 21 2014

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Software

Fingerprint

Dive into the research topics of 'Parallelization of Reordering Algorithms for Bandwidth and Wavefront Reduction'. Together they form a unique fingerprint.

Cite this