Scheduling the I/O of HPC Applications under Congestion

Ana Gainaru, Guillaume Aupy, Anne Benoit, Franck Cappello, Yves Robert, Marc Snir

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A significant percentage of the computing capacity of large-scale platforms is wasted because of interference incurred by multiple applications that access a shared parallel file system concurrently. One solution to handling I/O bursts in large-scale HPC systems is to absorb them at an intermediate storage layer consisting of burst buffers. However, our analysis of Argonne's Mira system shows that burst buffers cannot prevent congestion at all times. Consequently, I/O performance is dramatically degraded, showing in some cases a decrease in I/O throughput of 67%. In this paper, we analyze the effects of interference on application I/O bandwidth and propose several scheduling techniques to mitigate congestion. We show through extensive experiments that our global I/O scheduler is able to reduce the effects of congestion, even on systems where burst buffers are used, and can increase the overall system throughput by up to 56%. We also show that it outperforms current Mira I/O schedulers.

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1013-1022
Number of pages: 10
ISBN (Electronic): 9781479986484
DOIs
State: Published - Jul 17 2015
Event: 29th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015 - Hyderabad, India
Duration: May 25 2015 → May 29 2015

Publication series

Name: Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015

Other

Other: 29th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015
Country/Territory: India
City: Hyderabad
Period: 5/25/15 → 5/29/15

Keywords

  • HPC application performance
  • I/O congestion
  • I/O scheduler
  • burst buffers

ASJC Scopus subject areas

  • Computer Networks and Communications
