Performance Optimization for Relative-Error-Bounded Lossy Compression on Scientific Data

Xiangyu Zou, Tao Lu, Wen Xia, Xuan Wang, Weizhe Zhang, Haijun Zhang, Sheng Di, Dingwen Tao, Franck Cappello

Research output: Contribution to journal › Article › peer-review

Abstract

Scientific simulations in high-performance computing (HPC) environments generate vast volumes of data, which may cause a severe I/O bottleneck at runtime and a huge burden on storage space for postanalysis. Unlike traditional data reduction schemes such as deduplication or lossless compression, error-controlled lossy compression not only significantly reduces the data size but also holds the promise of satisfying user demands on error control. Pointwise relative error bounds (i.e., compression errors depend on the data values) are widely used by many scientific applications with lossy compression, since error control can adapt to the error bound in the dataset automatically. However, pointwise relative-error-bounded compression is complicated and time consuming. In this article, we develop efficient precomputation-based mechanisms based on the SZ lossy compression framework. Our mechanisms avoid the costly logarithmic transformation and identify quantization factor values via a fast table lookup, greatly accelerating relative-error-bounded compression while retaining excellent compression ratios. In addition, we reduce the traversing operations in Huffman decoding, significantly accelerating the decompression process in SZ. Experiments with eight well-known real-world scientific simulation datasets show that our solution improves the compression and decompression rates (i.e., the speed) by about 40 and 80 percent, respectively, in most cases, making our designed lossy compression strategy the best-in-class solution in most cases.
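
The table-lookup idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or the SZ API: the function names (`build_tables`, `quantize`, `reconstruct`), the geometric bin layout, and the use of binary search as the lookup are all assumptions made for illustration, and only strictly positive values are handled. The sketch precomputes bin edges and centers once, so that quantizing a value needs no logarithm at compression time, while every reconstructed value stays within the pointwise relative error bound `eps`.

```python
import bisect
import math


def build_tables(eps, lo, hi):
    """Precompute geometric bin centers and upper edges covering [lo, hi].

    Hypothetical helper: bins have width 2*log2(1+eps) in the log2 domain,
    so any value in a bin differs from the bin center by at most a factor
    of (1+eps). Logarithms are used only here, once, not per data point.
    """
    step = 2.0 * math.log2(1.0 + eps)
    first = math.floor(math.log2(lo) / step) - 1   # one guard bin below lo
    last = math.ceil(math.log2(hi) / step) + 1     # one guard bin above hi
    codes = range(first, last + 1)
    centers = [2.0 ** (c * step) for c in codes]
    edges = [2.0 ** ((c + 0.5) * step) for c in codes]  # upper edge of each bin
    return centers, edges


def quantize(x, edges):
    """Return the index of the bin containing x, using only a table search."""
    return bisect.bisect_left(edges, x)


def reconstruct(index, centers):
    """Map a bin index back to its representative value."""
    return centers[index]


# Usage: check the pointwise relative error on a few positive samples.
eps = 0.01
centers, edges = build_tables(eps, 1e-3, 1e3)
for x in [0.0123, 0.5, 3.14159, 42.0, 999.9]:
    approx = reconstruct(quantize(x, edges), centers)
    # Small slack allows for floating-point rounding in the tables.
    assert abs(approx - x) / x <= eps * (1 + 1e-9)
```

In this sketch the per-value cost is a binary search over a table of a few hundred entries; a production compressor would also need to handle zeros, negative values, and values outside the precomputed range, which are omitted here.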

Original language: English (US)
Article number: 8989806
Pages (from-to): 1665-1680
Number of pages: 16
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 31
Issue number: 7
State: Published - Jul 1 2020
Externally published: Yes

Keywords

  • Lossy compression
  • compression rate
  • high-performance computing
  • scientific data

ASJC Scopus subject areas

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics
