HPC-colony: Services and interfaces for very large systems

Sayantan Chakravorty, Celso L. Mendes, Laxmikant V. Kalé, Terry Jones, Andrew Tauferner, Todd Inglett, José Moreira

Research output: Contribution to journal › Article › peer-review

Abstract

Traditional full-featured operating systems are known to have properties that limit the scalability of distributed-memory parallel programs, the most common programming paradigm in high-end computing. Furthermore, as processor counts increase on the most capable systems, the activity needed to manage the system becomes more of a burden. To make a general-purpose operating system scale to such levels, new technology is required for parallel resource management and global system management (including fault management). In this paper, we describe the shortcomings of full-featured operating systems and runtime systems and discuss an approach to scale such systems to one hundred thousand processors with both scalable parallel application performance and efficient system management.

Original language: English (US)
Pages (from-to): 43-49
Number of pages: 7
Journal: Operating Systems Review (ACM)
Volume: 40
Issue number: 2
State: Published - Apr 2006

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture
  • Computer Networks and Communications
