TY - GEN
T1 - Damaris: How to Efficiently Leverage Multicore Parallelism to Achieve Scalable, Jitter-Free I/O
T2 - 2012 IEEE International Conference on Cluster Computing, CLUSTER 2012
AU - Dorier, Matthieu
AU - Antoniu, Gabriel
AU - Cappello, Franck
AU - Snir, Marc
AU - Orf, Leigh
PY - 2012
Y1 - 2012
AB - With exascale computing on the horizon, the performance variability of I/O systems represents a key challenge in sustaining high performance. In many HPC applications, I/O is concurrently performed by all processes, which leads to I/O bursts. This causes resource contention and substantial variability of I/O performance, which significantly impacts the overall application performance and, most importantly, its predictability over time. In this paper, we propose a new approach to I/O, called Damaris, which leverages dedicated I/O cores on each multicore SMP node, along with the use of shared memory, to efficiently perform asynchronous data processing and I/O in order to hide this variability. We evaluate our approach on three different platforms including the Kraken Cray XT5 supercomputer (ranked 11th in Top500), with the CM1 atmospheric model, one of the target HPC applications for the Blue Waters post-petascale supercomputer project. By overlapping I/O with computation and by gathering data into large files while avoiding synchronization between cores, our solution brings several benefits: 1) it fully hides jitter as well as all I/O-related costs, which makes simulation performance predictable, 2) it increases the sustained write throughput by a factor of 15 compared to standard approaches, 3) it allows almost perfect scalability of the simulation up to over 9,000 cores, as opposed to state-of-the-art approaches which fail to scale, 4) it enables a 600% compression ratio without any additional overhead, leading to a major reduction of storage requirements.
KW - Dedicated Cores
KW - Exascale Computing
KW - I/O
KW - Multicore Architectures
KW - Variability
UR - http://www.scopus.com/inward/record.url?scp=84870695350&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84870695350&partnerID=8YFLogxK
U2 - 10.1109/CLUSTER.2012.26
DO - 10.1109/CLUSTER.2012.26
M3 - Conference contribution
AN - SCOPUS:84870695350
SN - 9780768548074
T3 - Proceedings - 2012 IEEE International Conference on Cluster Computing, CLUSTER 2012
SP - 155
EP - 163
BT - Proceedings - 2012 IEEE International Conference on Cluster Computing, CLUSTER 2012
PB - IEEE Computer Society
Y2 - 24 September 2012 through 28 September 2012
ER -