Fine-grained Policy-driven I/O Sharing for Burst Buffers

Ed Karrels, Lei Huang, Yuhong Kan, Ishank Arora, Yinzhi Wang, Daniel S. Katz, William Gropp, Zhao Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A burst buffer is a common method to bridge the performance gap between the I/O needs of modern supercomputing applications and the performance of the shared file system on large-scale supercomputers. However, existing I/O sharing methods require resource isolation, offline profiling, or repeated execution, which significantly limits the utilization and applicability of these systems. Here we present ThemisIO, a policy-driven I/O sharing framework for a remote-shared burst buffer: a dedicated group of I/O nodes, each with a local storage device. ThemisIO preserves high utilization by implementing opportunity fairness so that it can reallocate unused I/O resources to other applications. ThemisIO accurately and efficiently allocates I/O cycles among applications, purely based on real-time I/O behavior without requiring user-supplied information or offline-profiled application characteristics. ThemisIO supports a variety of fair sharing policies, such as user-fair and size-fair, as well as composite policies, e.g., group-then-user-fair. All these features are enabled by its statistical token design. ThemisIO can alter the execution order of incoming I/O requests based on assigned tokens to precisely balance I/O cycles between applications via time slicing, thereby enforcing processing isolation. Experiments using I/O benchmarks show that ThemisIO sustains 13.5-13.7% higher I/O throughput and 19.5-40.4% lower performance variation than existing algorithms. For real applications, ThemisIO reduces the slowdown caused by I/O interference by 59.1-99.8%.
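The token-based reordering described in the abstract can be illustrated with a small sketch. This is not ThemisIO's implementation: it assumes a lottery-style reading of "statistical token", in which each job with pending requests holds a share of tokens set by the policy and the server draws the next request at random in proportion to those shares. The `Job` class, `token_shares`, `pick_next`, and the exact definitions of the two policies (user-fair as an equal split per user and then per job, size-fair as proportional to node count) are all hypothetical names and assumptions introduced for illustration.

```python
import random
from collections import Counter, deque


class Job:
    """Hypothetical job record: name, owning user, node count, pending requests."""

    def __init__(self, name, user, nodes):
        self.name = name
        self.user = user
        self.nodes = nodes
        self.queue = deque()  # pending I/O requests, oldest first


def token_shares(active_jobs, policy):
    """Return each active job's share of I/O cycles; shares sum to 1."""
    if policy == "size-fair":
        # Assumption: share proportional to the job's node count.
        weights = {j.name: float(j.nodes) for j in active_jobs}
    elif policy == "user-fair":
        # Assumption: users split cycles evenly; a user's jobs split that user's part.
        jobs_per_user = Counter(j.user for j in active_jobs)
        weights = {j.name: 1.0 / jobs_per_user[j.user] for j in active_jobs}
    else:
        raise ValueError(f"unknown policy: {policy}")
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def pick_next(jobs, policy, rng):
    """Draw the next request to serve, or return None if nothing is pending."""
    # Only jobs with queued requests take part, so an idle job's share
    # is redistributed among the jobs that are actually issuing I/O.
    active = [j for j in jobs if j.queue]
    if not active:
        return None
    shares = token_shares(active, policy)
    chosen = rng.choices(active, weights=[shares[j.name] for j in active], k=1)[0]
    return chosen.name, chosen.queue.popleft()
```

Because shares are recomputed over the jobs that currently have queued requests, a job that stops issuing I/O gives up its cycles to the others, which is one simple way to model the opportunity fairness the abstract mentions. Shares here depend only on observed queue state and job metadata, not on offline profiles.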

Original language: English (US)
Title of host publication: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2023
Publisher: Association for Computing Machinery
ISBN (Electronic): 9798400701092
State: Published - Nov 12 2023
Event: 2023 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2023 - Denver, United States
Duration: Nov 12 2023 → Nov 17 2023

Publication series

Name: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2023

Conference

Conference: 2023 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2023
Country/Territory: United States
City: Denver
Period: 11/12/23 → 11/17/23

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
