Optimizing error-bounded lossy compression for scientific data by dynamic spline interpolation

Kai Zhao, Sheng Di, Maxim Dmitriev, Thierry Laurent D. Tonellot, Zizhong Chen, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Today's scientific simulations are producing vast volumes of data that cannot be stored and transferred efficiently because of limited storage capacity, parallel I/O bandwidth, and network bandwidth. The situation is getting worse over time because of the ever-increasing gap between relatively slow data transfer speed and fast-growing computation power in modern supercomputers. Error-bounded lossy compression is becoming one of the most critical techniques for resolving the big scientific data issue, in that it can significantly reduce the scientific data volume while guaranteeing that the reconstructed data is valid for users because of its compression-error-bounding feature. In this paper, we present a novel error-bounded lossy compressor based on a state-of-the-art prediction-based compression framework. Our solution exhibits substantially better compression quality than all of the existing error-bounded lossy compressors, with comparable compression speed. Specifically, our contribution is threefold. (1) We provide an in-depth analysis of why the best-existing prediction-based lossy compressor can only minimally improve the compression quality. (2) We propose a dynamic spline interpolation approach with a series of optimization strategies that can significantly improve the data prediction accuracy, substantially improving the compression quality in turn. (3) We perform a thorough evaluation using six real-world scientific simulation datasets across different science domains to evaluate our solution vs. all other related works. Experiments show that the compression ratio of our solution is higher than that of the second-best lossy compressor by 20% to 460% with the same error bound in most of the cases.
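
The general idea of prediction-based, error-bounded compression mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's compressor: it assumes non-empty 1D data, uses plain linear interpolation in place of dynamic spline interpolation, and omits the entropy-coding stage that would actually shrink the output. The names `compress`, `decompress`, `_traversal`, and `_predict` are hypothetical.

```python
import math

def _traversal(n):
    """Yield (index, stride) pairs from coarse to fine; index 0 is the anchor."""
    if n < 2:
        return
    stride = 1 << ((n - 1).bit_length() - 1)  # largest power of two <= n - 1
    while stride >= 1:
        for i in range(stride, n, 2 * stride):
            yield i, stride
        stride //= 2

def _predict(recon, i, stride):
    """Linear interpolation from already-reconstructed neighbors (assumed predictor)."""
    left = recon[i - stride]
    if i + stride < len(recon):
        return 0.5 * (left + recon[i + stride])
    return left  # no right neighbor: fall back to the left value

def compress(data, eb):
    """Return (anchor, quantization codes, exact fallbacks) for error bound eb > 0."""
    n = len(data)
    recon = [0.0] * n
    recon[0] = data[0]  # the first value is stored losslessly
    codes, exact = [], {}
    for i, s in _traversal(n):
        pred = _predict(recon, i, s)
        q = round((data[i] - pred) / (2 * eb))  # quantize the prediction error
        value = pred + q * 2 * eb
        if abs(value - data[i]) > eb:  # guard against floating-point rounding
            exact[i] = data[i]
            value, q = data[i], 0
        codes.append(q)
        recon[i] = value  # predict from reconstructed values, as the decoder will
    return data[0], codes, exact

def decompress(anchor, codes, exact, n, eb):
    """Replay the same traversal and predictions to rebuild the data."""
    recon = [0.0] * n
    recon[0] = anchor
    for (i, s), q in zip(_traversal(n), codes):
        if i in exact:
            recon[i] = exact[i]
        else:
            recon[i] = _predict(recon, i, s) + q * 2 * eb
    return recon

data = [math.sin(0.1 * i) for i in range(100)]
eb = 1e-3
anchor, codes, exact = compress(data, eb)
recon = decompress(anchor, codes, exact, len(data), eb)
max_err = max(abs(a - b) for a, b in zip(data, recon))
```

In this sketch, a more accurate predictor yields quantization codes closer to zero, which a later entropy coder could store more compactly. That is the sense in which better prediction accuracy translates into a higher compression ratio at the same error bound.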

Original language: English (US)
Title of host publication: Proceedings - 2021 IEEE 37th International Conference on Data Engineering, ICDE 2021
Publisher: IEEE Computer Society
Pages: 1643-1654
Number of pages: 12
ISBN (Electronic): 9781728191843
State: Published - Apr 2021
Event: 37th IEEE International Conference on Data Engineering, ICDE 2021 - Virtual, Chania, Greece
Duration: Apr 19 2021 to Apr 22 2021

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2021-April
ISSN (Print): 1084-4627

Conference

Conference: 37th IEEE International Conference on Data Engineering, ICDE 2021
Country/Territory: Greece
City: Virtual, Chania
Period: 4/19/21 to 4/22/21

Keywords

  • Data compression
  • Data reduction
  • Lossy compressor
  • Scientific data management

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
