A scalable, numerically stable, high-performance tridiagonal solver using GPUs

Li Wen Chang, John A. Stratton, Hee Seok Kim, Wen-Mei W Hwu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper, we present a scalable, numerically stable, high-performance tridiagonal solver. The solver is based on the SPIKE algorithm, which partitions a large matrix into small independent matrices that can be solved in parallel. For each small matrix, our solver applies a general 1-by-1 or 2-by-2 diagonal pivoting algorithm, which is also known to be numerically stable. Our paper makes two major contributions. First, our solver is the first numerically stable tridiagonal solver for GPUs. It provides solution quality comparable to Intel MKL and Matlab, at speeds comparable to the GPU tridiagonal solvers in existing packages like CUSPARSE. It is also scalable to multiple GPUs and CPUs. Second, we present and analyze two key optimization strategies for our solver: a high-throughput data layout transformation for memory efficiency, and a dynamic tiling approach for reducing the memory access footprint caused by branch divergence.
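As a rough illustration of the partitioning idea described in the abstract, the following Python sketch splits one tridiagonal system into two partitions, solves each partition independently, and then couples them through a small reduced system. It is a sketch under stated assumptions, not the paper's implementation: it uses only two partitions, runs on the CPU, and substitutes the plain Thomas algorithm (no pivoting) for the diagonal pivoting solver, so it does not share the paper's numerical stability. The function names `thomas` and `spike_two_partitions` and the array convention are hypothetical.

```python
def thomas(a, b, c, d):
    """Solve one tridiagonal system with the Thomas algorithm (no pivoting).

    a: sub-diagonal (a[0] is never read), b: main diagonal,
    c: super-diagonal (c[-1] does not affect the result), d: right-hand side.
    ASSUMPTION: stand-in for the paper's diagonal pivoting solver; it can
    break down on matrices that need pivoting.
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def spike_two_partitions(a, b, c, d, m):
    """Solve a tridiagonal system split into rows [0, m) and [m, n)."""
    n = len(b)
    assert 0 < m < n
    # Diagonal blocks. The coupling entries c[m-1] and a[m] drop out of the
    # local solves because thomas() does not use c[-1] or a[0] of a block.
    top = (a[:m], b[:m], c[:m])
    bot = (a[m:], b[m:], c[m:])

    # Four independent local solves; these could run in parallel.
    y1 = thomas(*top, d[:m])
    y2 = thomas(*bot, d[m:])
    v = thomas(*top, [0.0] * (m - 1) + [c[m - 1]])  # right spike of block 1
    w = thomas(*bot, [a[m]] + [0.0] * (n - m - 1))  # left spike of block 2

    # Reduced 2x2 system for the interface unknowns s = x[m-1], t = x[m]:
    #   s + v[-1] * t = y1[-1]
    #   w[0] * s + t  = y2[0]
    det = 1.0 - v[-1] * w[0]
    s = (y1[-1] - v[-1] * y2[0]) / det
    t = (y2[0] - w[0] * y1[-1]) / det

    # Recover the full solution from the local solutions and the spikes.
    x1 = [y1[i] - v[i] * t for i in range(m)]
    x2 = [y2[i] - w[i] * s for i in range(n - m)]
    return x1 + x2
```

For a diagonally dominant test matrix, `spike_two_partitions(a, b, c, d, m)` should agree with a single call to `thomas(a, b, c, d)` up to rounding. Extending the sketch to more than two partitions makes the reduced system larger but leaves the local solves independent.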

Original language: English (US)
Title of host publication: 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012
DOIs: https://doi.org/10.1109/SC.2012.12
State: Published - Dec 1 2012
Event: 2012 24th International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012 - Salt Lake City, UT, United States
Duration: Nov 10 2012 to Nov 16 2012

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Other

Other: 2012 24th International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012
Country: United States
City: Salt Lake City, UT
Period: 11/10/12 to 11/16/12

Fingerprint

Data storage equipment
Program processors
Throughput
Graphics processing unit

Keywords

  • GPGPU
  • GPU Computing
  • SPIKE
  • Tridiagonal Solver

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Software

Cite this

Chang, L. W., Stratton, J. A., Kim, H. S., & Hwu, W-M. W. (2012). A scalable, numerically stable, high-performance tridiagonal solver using GPUs. In 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012 [6468510] (International Conference for High Performance Computing, Networking, Storage and Analysis, SC). https://doi.org/10.1109/SC.2012.12

@inproceedings{3e280c37ef2445008d21dfff669af3da,
title = "A scalable, numerically stable, high-performance tridiagonal solver using GPUs",
keywords = "GPGPU, GPU Computing, SPIKE, Tridiagonal Solver",
author = "Chang, {Li Wen} and Stratton, {John A.} and Kim, {Hee Seok} and Hwu, {Wen-Mei W}",
year = "2012",
month = "12",
day = "1",
doi = "10.1109/SC.2012.12",
language = "English (US)",
isbn = "9781467308069",
series = "International Conference for High Performance Computing, Networking, Storage and Analysis, SC",
booktitle = "2012 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012",

}

TY  - GEN
T1  - A scalable, numerically stable, high-performance tridiagonal solver using GPUs
AU  - Chang, Li Wen
AU  - Stratton, John A.
AU  - Kim, Hee Seok
AU  - Hwu, Wen-Mei W
PY  - 2012/12/1
Y1  - 2012/12/1
KW  - GPGPU
KW  - GPU Computing
KW  - SPIKE
KW  - Tridiagonal Solver
UR  - http://www.scopus.com/inward/record.url?scp=84877702106&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84877702106&partnerID=8YFLogxK
U2  - 10.1109/SC.2012.12
DO  - 10.1109/SC.2012.12
M3  - Conference contribution
AN  - SCOPUS:84877702106
SN  - 9781467308069
T3  - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT  - 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012
ER  -