Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation

Xin Liang, Sheng Di, Dingwen Tao, Sihuan Li, Bogdan Nicolae, Zizhong Chen, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Because of the ever-increasing volume of data produced by today's high-performance computing (HPC) scientific simulations, I/O performance is becoming a significant bottleneck for their execution. An efficient error-controlled lossy compressor is a promising way to significantly reduce data writing time for scientific simulations running on supercomputers. In this paper, we explore how to optimize data dumping performance for scientific simulations by leveraging error-bounded lossy compression techniques. The contributions of the paper are threefold. (1) We propose a novel I/O performance profiling model that can effectively represent I/O performance at different execution scales and data sizes, and we optimize the estimation accuracy of data dumping performance using the least-squares method. (2) We develop an adaptive lossy compression framework that can select the best-fit compressor (between two leading lossy compressors, SZ and ZFP) with parameter settings optimized for overall data dumping performance. (3) We evaluate our adaptive lossy compression framework with up to 32k cores on a supercomputer equipped with fast I/O systems, using real-world scientific simulation datasets. Experiments show that our solution can almost always bring data dumping performance to the optimal level, with very accurate selection of the best-fit lossy compressor and settings. The data dumping performance can be improved by up to 27% at different scales.
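
The two ideas in contributions (1) and (2) can be illustrated with a minimal sketch. It is not the paper's model or framework: it assumes, purely for illustration, that write time is linear in the number of bytes written, fits that line by ordinary least squares, and then picks the candidate whose compression time plus predicted write time is smallest. The names `fit_line`, `Candidate`, and `pick_best`, the profiling samples, and the per-compressor figures are all hypothetical, and no real SZ or ZFP API is called.

```python
from dataclasses import dataclass


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


@dataclass
class Candidate:
    """One compressor/setting pair with hypothetical measured costs."""
    name: str
    compress_seconds: float  # time spent compressing
    compressed_gb: float     # size left to write after compression


def pick_best(candidates, a, b):
    """Return the candidate minimizing compression time + predicted write time."""
    def dump_time(c):
        return c.compress_seconds + a + b * c.compressed_gb
    return min(candidates, key=dump_time)


# Hypothetical profiling samples: data size written (GB) and write time (s).
sizes = [1.0, 2.0, 4.0, 8.0]
times = [1.5, 2.5, 4.5, 8.5]
a, b = fit_line(sizes, times)

# Hypothetical candidates; the numbers are made up for illustration only.
candidates = [
    Candidate("compressor_A", compress_seconds=2.0, compressed_gb=1.0),
    Candidate("compressor_B", compress_seconds=1.0, compressed_gb=3.0),
]
best = pick_best(candidates, a, b)
print(best.name)
```

In this toy setup the slower but stronger compressor wins because the write time it saves outweighs its extra compression time; with a faster I/O system (a smaller slope `b`) the choice could flip, which is the trade-off an adaptive selection has to weigh.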

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE International Conference on Cluster Computing, CLUSTER 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728147345
State: Published - Sep 2019
Event: 2019 IEEE International Conference on Cluster Computing, CLUSTER 2019 - Albuquerque, United States
Duration: Sep 23 2019 to Sep 26 2019

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
Volume: 2019-September
ISSN (Print): 1552-5244

Conference

Conference: 2019 IEEE International Conference on Cluster Computing, CLUSTER 2019
Country/Territory: United States
City: Albuquerque
Period: 9/23/19 to 9/26/19

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Signal Processing
