SZ3: A Modular Framework for Composing Prediction-Based Error-Bounded Lossy Compressors

Xin Liang, Kai Zhao, Sheng Di, Sihuan Li, Robert Underwood, Ali M. Gok, Jiannan Tian, Junjing Deng, Jon C. Calhoun, Dingwen Tao, Zizhong Chen, Franck Cappello

Research output: Contribution to journal › Article › peer-review

Abstract

Today's scientific simulations require a significant reduction of data volume because of the extremely large amounts of data they produce and the limited I/O bandwidth and storage space. Error-bounded lossy compression has been considered one of the most effective solutions to this problem. In practice, however, the best-fit compression method often needs to be customized or optimized, in particular because of the diverse characteristics of different datasets and the varied user requirements on compression quality and performance. In this paper, we address this issue with a novel modular, composable compression framework named SZ3. Our contributions are fourfold. (1) We develop SZ3, which features an innovative modular abstraction for the prediction-based compression framework, such that compression modules can be plugged in easily to create new compressors based on the characteristics of the data and user requirements. (2) We create a new compression pipeline with SZ3 for GAMESS data, which significantly improves the compression ratios over state-of-the-art compressors. (3) We develop an adaptive compression pipeline with SZ3 for APS data with minimal effort, which leads to the best rate-distortion among all existing error-bounded lossy compressors for any bit-rate. (4) We compare the sustainability of SZ3 with leading error-bounded prediction-based compressors, and then demonstrate the necessity of diverse pipelines by integrating and evaluating several compression pipelines on diverse scientific datasets from multiple disciplines. Experiments show that SZ3 incurs very limited overhead in compressor integration and that our customized compression pipelines lead to up to 20% improvement in compression ratios under the same data distortion, when compared with the best existing approach.
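To make the idea of pluggable modules in a prediction-based, error-bounded compressor more concrete, here is a minimal Python sketch. It is not the SZ3 API or the authors' implementation: the names `previous_value`, `linear_extrapolation`, `compress`, and `decompress` are hypothetical, the data is treated as a 1D sequence, and the entropy-coding and lossless stages that a real compressor would apply to the integer codes are omitted. The sketch assumes a common scheme in which each value is predicted from already-reconstructed values and the prediction residual is quantized in steps of twice the error bound.

```python
from typing import Callable, List, Sequence

# Hypothetical module interface: a predictor maps the values reconstructed
# so far to a guess for the next value.
Predictor = Callable[[Sequence[float]], float]


def previous_value(recon: Sequence[float]) -> float:
    """Predict the next value as the last reconstructed value (0.0 at start)."""
    return recon[-1] if recon else 0.0


def linear_extrapolation(recon: Sequence[float]) -> float:
    """Predict the next value by extending the line through the last two values."""
    if len(recon) < 2:
        return previous_value(recon)
    return 2.0 * recon[-1] - recon[-2]


def compress(data: Sequence[float], error_bound: float,
             predictor: Predictor) -> List[int]:
    """Quantize prediction residuals into integer codes.

    Each residual is rounded to the nearest multiple of 2 * error_bound,
    so the reconstructed value differs from the original by at most
    error_bound (up to floating-point rounding).
    """
    step = 2.0 * error_bound
    codes: List[int] = []
    recon: List[float] = []  # mirror of what the decompressor will see
    for x in data:
        pred = predictor(recon)
        code = round((x - pred) / step)
        codes.append(code)
        recon.append(pred + code * step)
    return codes


def decompress(codes: Sequence[int], error_bound: float,
               predictor: Predictor) -> List[float]:
    """Rebuild the data by replaying the same predictor on reconstructed values."""
    step = 2.0 * error_bound
    recon: List[float] = []
    for code in codes:
        recon.append(predictor(recon) + code * step)
    return recon
```

Swapping `previous_value` for `linear_extrapolation` changes the pipeline without touching the quantization logic, which is the kind of composition the abstract describes. A better-matched predictor yields smaller codes, which a later encoding stage could store more compactly.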

Original language: English (US)
Pages (from-to): 485-498
Number of pages: 14
Journal: IEEE Transactions on Big Data
Volume: 9
Issue number: 2
State: Published - Apr 1 2023
Externally published: Yes

Keywords

  • Big data
  • data reduction
  • error-bounded lossy compression
  • large-scale scientific simulation

ASJC Scopus subject areas

  • Information Systems and Management
  • Information Systems
