Improving HPC application performance in cloud through dynamic load balancing

Abhishek Gupta, Osman Sarood, Laxmikant V Kale, Dejan Milojicic

Research output: Contribution to conference (Paper)

Abstract

Driven by the benefits of elasticity and pay-as-you-go model, cloud computing is emerging as an attractive alternative and addition to in-house clusters and supercomputers for some High Performance Computing (HPC) applications. However, poor interconnect performance, heterogeneous and dynamic environment, and interference by other virtual machines (VMs) are some bottlenecks for efficient HPC in cloud. For tightly-coupled iterative applications, one slow processor slows down the entire application, resulting in poor CPU utilization. In this paper, we present a dynamic load balancer for tightly-coupled iterative HPC applications in cloud. It infers the static hardware heterogeneity in virtualized environments, and also adapts to the dynamic heterogeneity caused by the interference arising due to multi-tenancy. Through continuous live monitoring, instrumentation, and periodic refinement of task distribution to VMs, our load balancer adapts to the dynamic variations in cloud resources. Through experimental evaluation on a private cloud with 64 VMs using benchmarks and a real science application, we demonstrate performance benefits up to 45%. Finally, we analyze the effect of load balancing frequency, problem size, and computational granularity (problem decomposition) on the performance and scalability of our techniques.
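
The idea of periodically refining the task distribution across VMs of unequal speed can be illustrated with a small sketch. This is not the authors' implementation: the `VM` class, the `speed` field (standing in for a relative speed inferred from measured task times), the `refine` function, and the `tolerance` and `max_moves` parameters are all hypothetical names chosen for illustration. The sketch greedily moves one task at a time off the VM that would finish last, as long as the receiving VM would still finish earlier than the current makespan.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    speed: float  # hypothetical: relative speed inferred from measured task times
    tasks: dict = field(default_factory=dict)  # task id -> work units

    def finish_time(self) -> float:
        # Time this VM needs for one iteration at its current speed.
        return sum(self.tasks.values()) / self.speed


def refine(vms, tolerance=1.05, max_moves=100):
    """Greedy refinement step; returns the number of task migrations performed."""
    moves = 0
    for _ in range(max_moves):
        slowest = max(vms, key=VM.finish_time)
        makespan = slowest.finish_time()
        total_work = sum(sum(v.tasks.values()) for v in vms)
        ideal = total_work / sum(v.speed for v in vms)
        if makespan <= tolerance * ideal:
            break  # close enough to a speed-proportional distribution
        best = None
        for tid, work in slowest.tasks.items():
            for dst in vms:
                if dst is slowest:
                    continue
                new_dst = (sum(dst.tasks.values()) + work) / dst.speed
                # Accept only moves that keep the receiver below the makespan.
                if new_dst < makespan and (best is None or new_dst < best[0]):
                    best = (new_dst, tid, dst)
        if best is None:
            break  # no single migration helps
        _, tid, dst = best
        dst.tasks[tid] = slowest.tasks.pop(tid)
        moves += 1
    return moves


vms = [VM("vm-a", 1.0, {0: 4, 1: 4, 2: 4, 3: 4}), VM("vm-b", 2.0)]
moves = refine(vms)
print(moves, [v.finish_time() for v in vms])
```

In a running application, a step like this would be repeated at intervals with freshly measured speeds, so that a VM slowed by a co-located tenant gradually sheds work. How often to repeat it is a trade-off between migration overhead and responsiveness, which corresponds to the load balancing frequency the abstract mentions.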

Original language: English (US)
Pages: 402-409
Number of pages: 8
DOI: https://doi.org/10.1109/CCGrid.2013.65
State: Published - Aug 14 2013
Event: 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2013 - Delft, Netherlands
Duration: May 13 2013 to May 16 2013

Other

Other: 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2013
Country: Netherlands
City: Delft
Period: 5/13/13 to 5/16/13

Fingerprint

Dynamic loads
Resource allocation
Supercomputers
Cloud computing
Program processors
Scalability
Elasticity
Decomposition
Hardware
Monitoring
Virtual machine

Keywords

  • Cloud
  • High Performance Computing
  • Load-balance

ASJC Scopus subject areas

  • Software

Cite this

Gupta, A., Sarood, O., Kale, L. V., & Milojicic, D. (2013). Improving HPC application performance in cloud through dynamic load balancing. 402-409. Paper presented at 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2013, Delft, Netherlands. https://doi.org/10.1109/CCGrid.2013.65

@conference{0e86ddd94aa3499abb65b7311084ac61,
title = "Improving HPC application performance in cloud through dynamic load balancing",
abstract = "Driven by the benefits of elasticity and pay-as-you-go model, cloud computing is emerging as an attractive alternative and addition to in-house clusters and supercomputers for some High Performance Computing (HPC) applications. However, poor interconnect performance, heterogeneous and dynamic environment, and interference by other virtual machines (VMs) are some bottlenecks for efficient HPC in cloud. For tightly-coupled iterative applications, one slow processor slows down the entire application, resulting in poor CPU utilization. In this paper, we present a dynamic load balancer for tightly-coupled iterative HPC applications in cloud. It infers the static hardware heterogeneity in virtualized environments, and also adapts to the dynamic heterogeneity caused by the interference arising due to multi-tenancy. Through continuous live monitoring, instrumentation, and periodic refinement of task distribution to VMs, our load balancer adapts to the dynamic variations in cloud resources. Through experimental evaluation on a private cloud with 64 VMs using benchmarks and a real science application, we demonstrate performance benefits up to 45{\%}. Finally, we analyze the effect of load balancing frequency, problem size, and computational granularity (problem decomposition) on the performance and scalability of our techniques.",
keywords = "Cloud, High Performance Computing, Load-balance",
author = "Abhishek Gupta and Osman Sarood and Kale, {Laxmikant V} and Dejan Milojicic",
year = "2013",
month = "8",
day = "14",
doi = "10.1109/CCGrid.2013.65",
language = "English (US)",
pages = "402--409",
note = "13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2013 ; Conference date: 13-05-2013 Through 16-05-2013",

}

TY - CONF

T1 - Improving HPC application performance in cloud through dynamic load balancing

AU - Gupta, Abhishek

AU - Sarood, Osman

AU - Kale, Laxmikant V

AU - Milojicic, Dejan

PY - 2013/8/14

Y1 - 2013/8/14

N2 - Driven by the benefits of elasticity and pay-as-you-go model, cloud computing is emerging as an attractive alternative and addition to in-house clusters and supercomputers for some High Performance Computing (HPC) applications. However, poor interconnect performance, heterogeneous and dynamic environment, and interference by other virtual machines (VMs) are some bottlenecks for efficient HPC in cloud. For tightly-coupled iterative applications, one slow processor slows down the entire application, resulting in poor CPU utilization. In this paper, we present a dynamic load balancer for tightly-coupled iterative HPC applications in cloud. It infers the static hardware heterogeneity in virtualized environments, and also adapts to the dynamic heterogeneity caused by the interference arising due to multi-tenancy. Through continuous live monitoring, instrumentation, and periodic refinement of task distribution to VMs, our load balancer adapts to the dynamic variations in cloud resources. Through experimental evaluation on a private cloud with 64 VMs using benchmarks and a real science application, we demonstrate performance benefits up to 45%. Finally, we analyze the effect of load balancing frequency, problem size, and computational granularity (problem decomposition) on the performance and scalability of our techniques.

AB - Driven by the benefits of elasticity and pay-as-you-go model, cloud computing is emerging as an attractive alternative and addition to in-house clusters and supercomputers for some High Performance Computing (HPC) applications. However, poor interconnect performance, heterogeneous and dynamic environment, and interference by other virtual machines (VMs) are some bottlenecks for efficient HPC in cloud. For tightly-coupled iterative applications, one slow processor slows down the entire application, resulting in poor CPU utilization. In this paper, we present a dynamic load balancer for tightly-coupled iterative HPC applications in cloud. It infers the static hardware heterogeneity in virtualized environments, and also adapts to the dynamic heterogeneity caused by the interference arising due to multi-tenancy. Through continuous live monitoring, instrumentation, and periodic refinement of task distribution to VMs, our load balancer adapts to the dynamic variations in cloud resources. Through experimental evaluation on a private cloud with 64 VMs using benchmarks and a real science application, we demonstrate performance benefits up to 45%. Finally, we analyze the effect of load balancing frequency, problem size, and computational granularity (problem decomposition) on the performance and scalability of our techniques.

KW - Cloud

KW - High Performance Computing

KW - Load-balance

UR - http://www.scopus.com/inward/record.url?scp=84881307424&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84881307424&partnerID=8YFLogxK

U2 - 10.1109/CCGrid.2013.65

DO - 10.1109/CCGrid.2013.65

M3 - Paper

AN - SCOPUS:84881307424

SP - 402

EP - 409

ER -