TY - GEN
T1 - Exploring Wavelet Transform Usages for Error-bounded Scientific Data Compression
AU - Huang, Jiajun
AU - Liu, Jinyang
AU - Di, Sheng
AU - Zhai, Yujia
AU - Jian, Zizhe
AU - Wu, Shixun
AU - Zhao, Kai
AU - Chen, Zizhong
AU - Guo, Yanfei
AU - Cappello, Franck
N1 - Foundation under Grant OAC-2003709, OAC-2104023, OAC-2311875, OAC-2311877, and OAC-2153451. We acknowledge the computing resources provided on Bebop (operated by Laboratory Computing Resource Center at Argonne) and on Theta and JLSE (operated by Argonne Leadership Computing Facility).
ACKNOWLEDGMENTS This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC, a collaborative effort of two DOE organizations – the Office of Science and the National Nuclear Security Administration, responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed platforms, to support the nation’s exascale computing imperative. The material was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (ASCR), under contract DE-AC02-06CH11357, and supported by the National Science
PY - 2023
Y1 - 2023
N2 - To address the challenges raised by the data management of exascale scientific data, error-bounded lossy compression has been proposed and well-researched as a prominent solution. Among the existing works, a recent trend leverages wavelet transforms in the error-bounded lossy compression task to effectively capture long-term data correlations within the inputs. Applying those transforms as data preprocessors and decorrelators, wavelet-based lossy compressors have achieved optimized compression rate-distortion on several datasets. However, certain significant limitations of wavelet-based compressors have also been observed: On one hand, attributed to the high computational cost of wavelet transforms, wavelet-based compressors suffer from relatively low computational efficiencies compared to other state-of-the-art compressors. On the other hand, one certain type of wavelet transform cannot perform well on all variations of scientific data. Consequently, to further fine-tune the wavelet-based scientific data lossy compression, more in-depth and systematic research and analysis needs to be conducted. In this paper, based on the FAZ auto-tuning-based modular compression framework, we have integrated a great number of wavelet transforms into the framework and evaluated them with various real-world scientific datasets and fields. From the analysis of those evaluations and the comparison to existing state-of-the-art wavelet-based and non-wavelet-based error-bounded lossy compressors, we conclude and present several essential takeaways for designing and optimizing the wavelet-based scientific error-bounded lossy compressor.
KW - error-bounded lossy compression
KW - scientific datasets
KW - wavelet transform
UR - http://www.scopus.com/inward/record.url?scp=85184976928&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85184976928&partnerID=8YFLogxK
U2 - 10.1109/BigData59044.2023.10386386
DO - 10.1109/BigData59044.2023.10386386
M3 - Conference contribution
AN - SCOPUS:85184976928
T3 - Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023
SP - 4233
EP - 4239
BT - Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023
A2 - He, Jingrui
A2 - Palpanas, Themis
A2 - Hu, Xiaohua
A2 - Cuzzocrea, Alfredo
A2 - Dou, Dejing
A2 - Slezak, Dominik
A2 - Wang, Wei
A2 - Gruca, Aleksandra
A2 - Lin, Jerry Chun-Wei
A2 - Agrawal, Rakesh
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE International Conference on Big Data, BigData 2023
Y2 - 15 December 2023 through 18 December 2023
ER -