A scalable priority-aware approach to managing data center server power

Yang Li, Charles R. Lefurgy, Karthick Rajamani, Malcolm S. Allen-Ware, Guillermo J. Silva, Daniel D. Heimsoth, Saugata Ghose, Onur Mutlu

Research output: Chapter in Book/Report/Conference proceedingConference contribution


Power management is a key component of modern data center design. Power managers must (1) ensure the cost- A nd energy-efficient utilization of the data center infrastructure, (2) maintain availability of the services provided by the center, and (3) address environmental concerns associated with the center's power consumption. While several power management techniques have been proposed and deployed in production data centers, there are still many challenges to comprehensive data center power management. This is particularly true in public cloud environments, where different jobs have different priority levels, and where high availability is critical. One example of the challenges facing public cloud data centers involves power capping. As power delivery must be highly reliable and tolerate wide variation in the load drawn by the data center components, the power infrastructure (e.g., power supplies, circuit breakers, UPS) has high redundancy and overprovisioning. During normal operation (i.e., typical server power demands, and no failures in the center), the power infrastructure is significantly underutilized. Power capping is a common solution to reduce this underutilization, by allowing more servers to be added safely (i.e., without power shortfalls) to the existing power infrastructure, and throttling power consumption in the infrequent cases where the demanded power exceeds the provisioned power capacity to avoid shortfalls. However, state-of-the-art power capping solutions are (1) not directly applicable to the redundant power infrastructure used in highly-available data centers; and (2) oblivious to differing workload priorities across the entire center when power consumption needs to be throttled, which can unnecessarily slow down high-priority work. To address this need, we develop CapMaestro, a new power management architecture with three key features for public cloud data centers. First, CapMaestro is designed to work with multiple power feeds (i.e., sources), and exploits server-level power capping to independently cap the load on each feed of a server. Second, CapMaestro uses a scalable, global priority-aware power capping approach, which accounts for power capacity at each level of the power distribution hierarchy. It exploits the underutilization of commonly-employed redundant power infrastructure at each level of the hierarchy to safely accommodate a much greater number of servers. Third, CapMaestro exploits stranded power (i.e., power budgets that are not utilized) in redundant power infrastructure to boost the performance of workloads in the data center. We add CapMaestro to a real cloud data center control plane, and demonstrate the effectiveness of all three key features. Using a large-scale data center simulation, we demonstrate that CapMaestro significantly and safely increases the number of servers for existing infrastructure. We also call out other key technical challenges the industry faces in data center power management.

Original languageEnglish (US)
Title of host publicationProceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019
PublisherInstitute of Electrical and Electronics Engineers Inc.
Number of pages14
ISBN (Electronic)9781728114446
StatePublished - Mar 26 2019
Externally publishedYes
Event25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019 - Washington, United States
Duration: Feb 16 2019Feb 20 2019

Publication series

NameProceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019


Conference25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019
CountryUnited States


  • Data center
  • Power capping
  • Priority aware resource allocation
  • Public cloud

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Fingerprint Dive into the research topics of 'A scalable priority-aware approach to managing data center server power'. Together they form a unique fingerprint.

Cite this