Improving HPC application performance in cloud through dynamic load balancing

Abhishek Gupta, Osman Sarood, Laxmikant V. Kale, Dejan Milojicic

Research output: Contribution to conference › Paper

Abstract

Driven by the benefits of elasticity and the pay-as-you-go model, cloud computing is emerging as an attractive alternative and addition to in-house clusters and supercomputers for some High Performance Computing (HPC) applications. However, poor interconnect performance, a heterogeneous and dynamic environment, and interference by other virtual machines (VMs) are some bottlenecks for efficient HPC in cloud. For tightly-coupled iterative applications, one slow processor slows down the entire application, resulting in poor CPU utilization. In this paper, we present a dynamic load balancer for tightly-coupled iterative HPC applications in cloud. It infers the static hardware heterogeneity in virtualized environments, and also adapts to the dynamic heterogeneity caused by interference arising from multi-tenancy. Through continuous live monitoring, instrumentation, and periodic refinement of task distribution to VMs, our load balancer adapts to the dynamic variations in cloud resources. Through experimental evaluation on a private cloud with 64 VMs using benchmarks and a real science application, we demonstrate performance benefits of up to 45%. Finally, we analyze the effect of load balancing frequency, problem size, and computational granularity (problem decomposition) on the performance and scalability of our techniques.
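The abstract's idea of periodically refining the task distribution across VMs of unequal effective speed can be illustrated with a small sketch. This is not the authors' algorithm: the function name `refine`, the `tolerance` and `max_moves` parameters, and the greedy "move one task from the slowest VM to the fastest" rule are all assumptions made for illustration. It assumes each VM's effective speed (hardware speed reduced by any interference) and each task's work have already been measured.

```python
def refine(assignment, work, speed, tolerance=1.05, max_moves=1000):
    """Greedy refinement step (illustrative only, not the paper's method).

    assignment: dict mapping VM name -> list of task ids
    work:       dict mapping task id -> measured work units
    speed:      dict mapping VM name -> effective speed (work units per second),
                assumed to already reflect heterogeneity and interference
    Returns a new assignment; the input is left unchanged.
    """
    assignment = {vm: list(tasks) for vm, tasks in assignment.items()}

    def finish_time(vm):
        return sum(work[t] for t in assignment[vm]) / speed[vm]

    # Finish time if the work could be split perfectly by speed.
    ideal = sum(work.values()) / sum(speed.values())

    for _ in range(max_moves):
        slowest = max(assignment, key=finish_time)
        fastest = min(assignment, key=finish_time)
        t_slow = finish_time(slowest)
        if t_slow <= tolerance * ideal:
            break  # balanced within tolerance
        # Pick the largest task whose move keeps the receiver below t_slow,
        # so the iteration time (the maximum finish time) never increases.
        candidate = None
        for t in sorted(assignment[slowest], key=work.get, reverse=True):
            if finish_time(fastest) + work[t] / speed[fastest] < t_slow:
                candidate = t
                break
        if candidate is None:
            break  # no single move helps
        assignment[slowest].remove(candidate)
        assignment[fastest].append(candidate)
    return assignment


# Hypothetical example: vm1 runs at half speed because of a noisy neighbor.
before = {"vm0": [0, 1, 2, 3], "vm1": [4, 5, 6, 7]}
after = refine(before, work={t: 1.0 for t in range(8)},
               speed={"vm0": 1.0, "vm1": 0.5})
```

In this sketch, a runtime would call `refine` every few iterations with freshly measured speeds, which is where the load balancing frequency discussed in the abstract would enter as a tunable parameter.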

Original language: English (US)
Pages: 402-409
Number of pages: 8
State: Published - Aug 14 2013
Event: 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2013 - Delft, Netherlands
Duration: May 13 2013 → May 16 2013

Other

Other: 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2013
Country: Netherlands
City: Delft
Period: 5/13/13 → 5/16/13

Keywords

  • Cloud
  • High Performance Computing
  • Load-balance

ASJC Scopus subject areas

  • Software
