TY - JOUR
T1 - Damaris: Addressing performance variability in data management for post-petascale simulations
AU - Dorier, Matthieu
AU - Antoniu, Gabriel
AU - Cappello, Franck
AU - Snir, Marc
AU - Sisneros, Robert
AU - Yildiz, Orcun
AU - Ibrahim, Shadi
AU - Peterka, Tom
AU - Orf, Leigh
N1 - Funding Information:
This work was done in the framework of a collaboration between the KerData (Inria Rennes Bretagne Atlantique, ENS Rennes, INSA Rennes, IRISA) team, the National Center for Supercomputing Applications (Urbana-Champaign, USA), and Argonne National Laboratory, within the Joint Inria-UIUC-ANL-BSC-JSC Laboratory for Extreme-Scale Computing (JLESC), formerly Joint Laboratory for Petascale Computing (JLPC). The material was based on work supported by the U.S. Department of Energy, Office of Science, under Contract No. DE-AC02-06CH11357; by the National Center for Atmospheric Research (NCAR); and by Central Michigan University. Some experiments presented in this article were carried out on the Grid’5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER, and several universities as well as other organizations (see https://www.grid5000.fr).
Publisher Copyright:
© 2016 ACM.
PY - 2016/12
Y1 - 2016/12
AB - With exascale computing on the horizon, reducing performance variability in data management tasks (storage, visualization, analysis, etc.) is becoming a key challenge in sustaining high performance. This variability significantly impacts the overall application performance at scale and its predictability over time. In this article, we present Damaris, a system that leverages dedicated cores in multicore nodes to offload data management tasks, including I/O, data compression, scheduling of data movements, in situ analysis, and visualization. We evaluate Damaris with the CM1 atmospheric simulation and the Nek5000 computational fluid dynamics simulation on four platforms, including NICS's Kraken and NCSA's Blue Waters. Our results show that (1) Damaris fully hides the I/O variability as well as all I/O-related costs, thus making simulation performance predictable; (2) it increases the sustained write throughput by a factor of up to 15 compared with standard I/O approaches; (3) it allows almost perfect scalability of the simulation up to over 9,000 cores, as opposed to state-of-the-art approaches that fail to scale; and (4) it enables a seamless connection to the VisIt visualization software to perform in situ analysis and visualization in a way that impacts neither the performance of the simulation nor its variability. In addition, we extended our implementation of Damaris to also support the use of dedicated nodes and conducted a thorough comparison of the two approaches (dedicated cores and dedicated nodes) for I/O tasks with the aforementioned applications.
KW - Damaris
KW - Dedicated cores
KW - Dedicated nodes
KW - Exascale computing
KW - I/O
KW - In situ visualization
UR - http://www.scopus.com/inward/record.url?scp=85048537754&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048537754&partnerID=8YFLogxK
U2 - 10.1145/2987371
DO - 10.1145/2987371
M3 - Article
AN - SCOPUS:85048537754
VL - 3
SP - 1
EP - 43
JO - ACM Transactions on Parallel Computing
JF - ACM Transactions on Parallel Computing
SN - 2329-4949
IS - 3
ER -