TY - GEN
T1 - Fulfilling the promises of lossy compression for scientific applications
AU - Cappello, Franck
AU - Di, Sheng
AU - Gok, Ali Murat
N1 - Funding Information:
Acknowledgments. The co-authors wish to thank (in alphabetical order): Mark Ainsworth, Julie Bessac, Jon Calhoun, Ozan Tugluk and Robert Underwood for the fruitful discussions within the ECP CODAR project. This research was supported by the ECP, Project Number: 17-SC-20-SC, a collaborative effort of two DOE organizations – the Office of Science and the National Nuclear Security Administration, responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed platforms, to support the nation’s exascale computing imperative. The material was based upon work supported by the DOE, Office of Science, under contract DE-AC02-06CH11357, and supported by the National Science Foundation under Grant No. 1763540, Grant No. 1617488 and Grant No. 2003709. We acknowledge the computing resources provided on Bebop, which is operated by the Laboratory Computing Resource Center at Argonne National Laboratory. This research also used computing resources of the Argonne Leadership Computing Facility.
Publisher Copyright:
© Springer Nature Switzerland AG 2020.
PY - 2021
Y1 - 2021
N2 - Many scientific simulations, machine/deep learning applications and instruments are in need of significant data reduction. Error-bounded lossy compression has been identified as one solution and has been tested for many use-cases: reducing streaming intensity (instruments), reducing storage and memory footprints, accelerating computation and accelerating data access and transfer. Ultimately, users’ trust in lossy compression relies on the preservation of science: the same conclusions should be drawn from computations or analysis done from lossy compressed data. Experience from scientific simulations, Artificial Intelligence (AI) and instruments reveals several points: (i) there are important gaps in the understanding of the effects of lossy compressed data on computations, AI and analysis, (ii) each use-case, application and user has its own requirements in terms of compression ratio, speed and accuracy, and current generic monolithic compressors are not responding well to this need for specialization. This situation calls for more research and development on the lossy compression technologies. This paper addresses the most pressing research needs regarding the application of lossy compression in the scientific context.
AB - Many scientific simulations, machine/deep learning applications and instruments are in need of significant data reduction. Error-bounded lossy compression has been identified as one solution and has been tested for many use-cases: reducing streaming intensity (instruments), reducing storage and memory footprints, accelerating computation and accelerating data access and transfer. Ultimately, users’ trust in lossy compression relies on the preservation of science: the same conclusions should be drawn from computations or analysis done from lossy compressed data. Experience from scientific simulations, Artificial Intelligence (AI) and instruments reveals several points: (i) there are important gaps in the understanding of the effects of lossy compressed data on computations, AI and analysis, (ii) each use-case, application and user has its own requirements in terms of compression ratio, speed and accuracy, and current generic monolithic compressors are not responding well to this need for specialization. This situation calls for more research and development on the lossy compression technologies. This paper addresses the most pressing research needs regarding the application of lossy compression in the scientific context.
KW - Lossy compression
KW - Scientific data
UR - http://www.scopus.com/inward/record.url?scp=85102795619&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102795619&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-63393-6_7
DO - 10.1007/978-3-030-63393-6_7
M3 - Conference contribution
AN - SCOPUS:85102795619
SN - 9783030633929
T3 - Communications in Computer and Information Science
SP - 99
EP - 116
BT - Driving Scientific and Engineering Discoveries Through the Convergence of HPC, Big Data and AI - 17th Smoky Mountains Computational Sciences and Engineering Conference, SMC 2020, Revised Selected Papers
A2 - Nichols, Jeffrey
A2 - Maccabe, Arthur ‘Barney’
A2 - Parete-Koon, Suzanne
A2 - Verastegui, Becky
A2 - Hernandez, Oscar
A2 - Ahearn, Theresa
PB - Springer
T2 - 17th Smoky Mountains Computational Sciences and Engineering Conference, SMC 2020
Y2 - 26 August 2020 through 28 August 2020
ER -