Coach: Exploiting Temporal Patterns for All-Resource Oversubscription in Cloud Platforms

Benjamin Reidys, Pantea Zardoshti, Íñigo Goiri, Celine Irvene, Daniel S. Berger, Haoran Ma, Kapil Arya, Eli Cortez, Taylor Stark, Eugene Bak, Mehmet Iyigun, Stanko Novakovic, Lisa Hsu, Karel Trueba, Abhisek Pan, Chetan Bansal, Saravan Rajmohan, Jian Huang, Ricardo Bianchini

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Cloud platforms remain underutilized despite multiple proposals to improve their utilization (e.g., disaggregation, harvesting, and oversubscription). Our characterization of the resource utilization of virtual machines (VMs) in Azure reveals that, while CPU is the main underutilized resource, we need to provide a solution to manage all resources holistically. We also observe that many VMs exhibit complementary temporal patterns, which can be leveraged to improve the oversubscription of underutilized resources. Based on these insights, we propose Coach: a system that exploits temporal patterns for all-resource oversubscription in cloud platforms. Coach uses long-term predictions and an efficient VM scheduling policy to exploit temporally complementary patterns. We introduce a new general-purpose VM type, called CoachVM, where we partition each resource allocation into a guaranteed and an oversubscribed portion. Coach monitors the oversubscribed resources to detect contention and mitigate any potential performance degradation. We focus on memory management, which is particularly challenging due to memory's sensitivity to contention and the overhead required to reassign it between CoachVMs. Our experiments show that Coach enables platforms to host up to ∼26% more VMs with minimal performance degradation.
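The benefit of temporally complementary patterns can be illustrated with a small sketch. It is not Coach's predictor or scheduling policy, only an assumed toy model: the function names and the demand values below are hypothetical and do not come from the paper. It compares the capacity reserved when each VM is provisioned for its own peak against the peak of the combined demand when the VMs share a server.

```python
from typing import Sequence


def sum_of_peaks(traces: Sequence[Sequence[int]]) -> int:
    """Capacity needed if every VM is provisioned for its own peak demand."""
    return sum(max(trace) for trace in traces)


def peak_of_sum(traces: Sequence[Sequence[int]]) -> int:
    """Capacity needed if co-located VMs share one pool: the largest
    combined demand observed in any single time window."""
    return max(sum(window) for window in zip(*traces))


# Hypothetical memory demand (GB) over four time windows of a day.
vm_day = [6, 8, 8, 2]    # busy during working hours
vm_night = [2, 1, 2, 7]  # busy overnight

reserved = sum_of_peaks([vm_day, vm_night])  # 8 + 7
needed = peak_of_sum([vm_day, vm_night])     # max of 8, 9, 10, 9

print(f"per-VM peak provisioning: {reserved} GB")
print(f"combined peak demand:     {needed} GB")
print(f"reclaimable headroom:     {reserved - needed} GB")
```

In this toy case the two VMs peak in different windows, so the shared pool needs less capacity than the sum of the individual peaks. VMs that peak in the same window would show no such headroom, which is why the scheduler has to account for when each VM is busy.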

Original language: English (US)
Title of host publication: ASPLOS 2025 - Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
Publisher: Association for Computing Machinery
Pages: 164-181
Number of pages: 18
ISBN (Electronic): 9798400706981
State: Published - Mar 30 2025
Event: 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2025 - Rotterdam, Netherlands
Duration: Mar 30 2025 to Apr 3 2025

Publication series

Name: International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
Volume: 1

Conference

Conference: 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2025
Country/Territory: Netherlands
City: Rotterdam
Period: 3/30/25 to 4/3/25

Keywords

  • cloud computing
  • memory oversubscription
  • resource management
  • temporal patterns

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture
