In this paper, we present a new sparse matrix data format that leads to improved memory coalescing and more efficient sparse matrix-vector multiplication for a wide range of problems on high-throughput architectures such as graphics processing units (GPUs). The sparse matrix structure is constructed by sorting the rows by row length (defined as the number of non-zero elements in a matrix row) and then partitioning them into two ranges, short rows and long rows. Based on this partition, the matrix entries are then transformed into ELLPACK format or vectorized compressed sparse row format. In addition, the number of threads assigned to each row is selected adaptively according to the row length, in order to balance the workload across GPU threads. Several computational experiments are presented to support this approach, and the results suggest a notable improvement across a wide range of matrix structures.
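The construction described in the abstract can be illustrated with a small CPU-side sketch. This is not the authors' implementation: the function names `build_hybrid`, `threads_for_row`, and `spmv`, the `threshold` parameter, the assignment of short rows to ELLPACK and long rows to CSR, and the power-of-two thread heuristic capped at a warp size of 32 are all assumptions made for illustration. The abstract does not specify how the split point or the thread count is chosen.

```python
def build_hybrid(indptr, indices, data, threshold):
    """Split a CSR matrix into an ELLPACK part and a CSR part by row length.

    Assumption: rows with at most `threshold` non-zeros count as "short" and
    go to ELLPACK; the remaining "long" rows stay in CSR.
    """
    n_rows = len(indptr) - 1
    lengths = [indptr[r + 1] - indptr[r] for r in range(n_rows)]
    # Sort row ids by row length (ascending), then partition into two ranges.
    order = sorted(range(n_rows), key=lambda r: lengths[r])
    short = [r for r in order if lengths[r] <= threshold]
    long_rows = [r for r in order if lengths[r] > threshold]

    # ELLPACK part: pad every short row to the same width with explicit zeros.
    width = max((lengths[r] for r in short), default=0)
    ell_cols, ell_vals = [], []
    for r in short:
        start, end = indptr[r], indptr[r + 1]
        pad = width - (end - start)
        ell_cols.append(list(indices[start:end]) + [0] * pad)
        ell_vals.append(list(data[start:end]) + [0.0] * pad)

    # CSR part: concatenate the long rows and rebuild the row pointer.
    csr_ptr, csr_cols, csr_vals = [0], [], []
    for r in long_rows:
        start, end = indptr[r], indptr[r + 1]
        csr_cols.extend(indices[start:end])
        csr_vals.extend(data[start:end])
        csr_ptr.append(len(csr_cols))

    return {"short": short, "long": long_rows,
            "ell_cols": ell_cols, "ell_vals": ell_vals,
            "csr_ptr": csr_ptr, "csr_cols": csr_cols, "csr_vals": csr_vals}


def threads_for_row(length, warp=32):
    """Hypothetical heuristic: smallest power of two >= length, capped at warp."""
    t = 1
    while t < length and t < warp:
        t *= 2
    return t


def spmv(h, x):
    """Reference y = A @ x over the hybrid structure, in original row order."""
    y = [0.0] * (len(h["short"]) + len(h["long"]))
    for i, r in enumerate(h["short"]):
        # Padded entries contribute 0.0 * x[0], so they do not change the sum.
        y[r] = sum(v * x[c] for c, v in zip(h["ell_cols"][i], h["ell_vals"][i]))
    for i, r in enumerate(h["long"]):
        start, end = h["csr_ptr"][i], h["csr_ptr"][i + 1]
        y[r] = sum(h["csr_vals"][k] * x[h["csr_cols"][k]]
                   for k in range(start, end))
    return y


# Example 4x4 matrix in CSR form; the row lengths are 1, 4, 2, 2.
indptr = [0, 1, 5, 7, 9]
indices = [0, 0, 1, 2, 3, 1, 3, 0, 2]
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
hybrid = build_hybrid(indptr, indices, data, threshold=2)
result = spmv(hybrid, [1, 2, 3, 4])
```

On a GPU, the ELLPACK part would typically be stored column-major so that consecutive threads read consecutive memory addresses, and each long row would be processed by a group of cooperating threads. The loops above only mirror the data layout and do not model that parallel execution.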

Original language: English (US)
Pages (from-to): 103-120
Number of pages: 18
Journal: International Journal of High Performance Computing Applications
Issue number: 1
State: Published - Feb 1 2016


  • EVC-HYB format
  • Sparse matrix-vector multiplication
  • adaptive
  • graphics processing unit
  • memory coalescing

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Software
  • Hardware and Architecture


Dive into the research topics of 'A hybrid format for better performance of sparse matrix-vector multiplication on a GPU'. Together they form a unique fingerprint.
