MTC envelope: Defining the capability of large scale computers in the context of parallel scripting applications

Zhao Zhang, Daniel S. Katz, Michael Wilde, Justin M. Wozniak, Ian Foster

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many scientific applications can be efficiently expressed with the parallel scripting (many-task computing, MTC) paradigm. These applications are typically composed of several stages of computation, with tasks in different stages coupled by a shared file system abstraction. However, we often see poor performance when running these applications on large scale computers due to the applications' frequency and volume of filesystem I/O and the absence of appropriate optimizations in the context of parallel scripting applications. In this paper, we show the capability of existing large scale computers to run parallel scripting applications by first defining the MTC envelope and then evaluating the envelope by benchmarking a suite of shared filesystem performance metrics. We also seek to determine the origin of the performance bottleneck by profiling the parallel scripting applications' I/O behavior and mapping the I/O operations to the MTC envelope. We show an example shared filesystem envelope and present a method to predict the I/O performance given the applications' level of I/O concurrency and I/O amount. This work is instrumental in guiding the development of parallel scripting applications to make efficient use of existing large scale computers, and to evaluate performance improvements in the hardware/software stack that will better facilitate parallel scripting applications.

Original language: English (US)
Title of host publication: HPDC 2013 - Proceedings of the 22nd ACM International Symposium on High-Performance Parallel and Distributed Computing
Publisher: Association for Computing Machinery
Pages: 37-48
Number of pages: 12
ISBN (Print): 9781450319102
State: Published - 2013
Externally published: Yes
Event: 22nd ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2013 - New York, NY, United States
Duration: Jun 17 2013 to Jun 21 2013

Keywords

  • distributed file system
  • MTC
  • parallel scripting application
  • performance measurements

ASJC Scopus subject areas

  • Software
