Efficient Error-Bounded Lossy Compression for CPU Architectures

Griffin Dube, Jiannan Tian, Sheng Di, Dingwen Tao, Jon C. Calhoun, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Modern HPC applications produce increasingly large amounts of data, which limits the performance of current extreme-scale systems. Lossy compression helps to mitigate this issue by decreasing the size of data generated by these applications. SZ, a current state-of-the-art lossy compressor, is able to achieve high compression ratios, but its prediction/quantization methods contain read-after-write (RAW) dependencies that prevent parallelizing this step of the compression. Recent work proposes a parallel dual prediction/quantization algorithm for GPUs which removes these dependencies. However, some HPC systems and applications do not use GPUs and could still benefit from the fine-grained parallelism of this method. Using the dual-quantization technique, we implement and optimize a SIMD-vectorized CPU version of SZ (vecSZ) and create a heuristic for selecting the optimal block size and vector length. We propose a novel block padding algorithm to decrease the number of unpredictable values along compression block borders and find it reduces the number of prediction outliers by up to 100%. We measure the performance of vecSZ against pSZ, a CPU version of SZ using dual-quantization, as well as SZ-1.4. Using real-world scientific datasets, we evaluate vecSZ on the Intel Skylake and AMD Rome architectures. vecSZ results in up to 32% improvement in rate-distortion and up to 15× speedup over SZ-1.4, achieving a prediction and quantization bandwidth in excess of 3.4 GB/s.
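The dual-quantization idea mentioned in the abstract can be illustrated with a minimal scalar sketch. It is not the vecSZ implementation: it assumes a 1-D previous-value (Lorenzo-style) predictor, and the function names, the sample data, and the `pad` parameter are hypothetical choices made for illustration. Values are first prequantized to integers within the error bound, and each prediction delta is then computed from those integers rather than from reconstructed values, so no iteration depends on the result of an earlier one. The `pad` argument stands in for the missing neighbor at a block border, loosely mirroring the motivation for block padding.

```python
def prequantize(data, eb):
    """Map each value to an integer multiple of 2*eb (absolute error <= eb)."""
    return [round(x / (2.0 * eb)) for x in data]


def postquantize(q, block, pad=0):
    """Deltas against the previous prequantized value within each block.

    Each delta reads only the input list `q`, never a reconstructed value,
    so there is no read-after-write dependency between iterations.
    `pad` (an assumption of this sketch) replaces the missing left
    neighbor at the start of every block.
    """
    return [q[i] - (pad if i % block == 0 else q[i - 1]) for i in range(len(q))]


def reconstruct(deltas, eb, block, pad=0):
    """Invert postquantize with a per-block running sum, then rescale."""
    out = []
    prev = pad
    for i, d in enumerate(deltas):
        if i % block == 0:
            prev = pad  # restart the running sum at each block border
        prev += d
        out.append(prev * 2.0 * eb)
    return out


# Hypothetical sample data and error bound.
data = [1.000, 1.013, 1.049, 1.038, 0.984, 0.971, 1.006, 1.027]
eb = 0.01

q = prequantize(data, eb)
deltas_zero_pad = postquantize(q, block=4)           # large delta at each border
deltas_near_pad = postquantize(q, block=4, pad=50)   # pad close to the data
restored = reconstruct(deltas_near_pad, eb, block=4, pad=50)
max_err = max(abs(a - b) for a, b in zip(data, restored))
```

With a zero pad, the first delta of every block is as large as the prequantized value itself, which is the kind of border outlier that padding is meant to avoid; a pad close to the block's values keeps all deltas small. Because `postquantize` has no loop-carried dependency, its body is the part that maps naturally onto SIMD lanes.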

Original language: English (US)
Title of host publication: Proceedings - 2022 30th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2022
Publisher: IEEE Computer Society
Pages: 89-96
Number of pages: 8
ISBN (Electronic): 9781665455800
DOIs
State: Published - 2022
Externally published: Yes
Event: 30th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2022 - Nice, France
Duration: Oct 18 2022 to Oct 20 2022

Publication series

Name: Proceedings - IEEE Computer Society's Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, MASCOTS
Volume: 2022-October
ISSN (Print): 1526-7539

Conference

Conference: 30th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2022
Country/Territory: France
City: Nice
Period: 10/18/22 to 10/20/22

Keywords

  • big data
  • compression
  • lossy compression
  • program optimization
  • vectorization

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Networks and Communications
  • Software
  • Modeling and Simulation
