TY - GEN
T1 - A distributed dynamic load balancer for iterative applications
AU - Menon, Harshitha
AU - Kalé, Laxmikant
PY - 2013
Y1 - 2013
N2 - For many applications, computation load varies over time. Such applications require dynamic load balancing to improve performance. Centralized load balancing schemes, which perform the load balancing decisions at a central location, are not scalable. In contrast, fully distributed strategies are scalable but typically do not produce a balanced work distribution as they tend to consider only local information. This paper describes a fully distributed algorithm for load balancing that uses partial information about the global state of the system to perform load balancing. This algorithm, referred to as GrapevineLB, consists of two stages: global information propagation using a lightweight algorithm inspired by epidemic [21] algorithms, and work unit transfer using a randomized algorithm. We provide analysis of the algorithm along with detailed simulation and performance comparison with other load balancing strategies. We demonstrate the effectiveness of GrapevineLB for adaptive mesh refinement and molecular dynamics on up to 131,072 cores of BlueGene/Q.
AB - For many applications, computation load varies over time. Such applications require dynamic load balancing to improve performance. Centralized load balancing schemes, which perform the load balancing decisions at a central location, are not scalable. In contrast, fully distributed strategies are scalable but typically do not produce a balanced work distribution as they tend to consider only local information. This paper describes a fully distributed algorithm for load balancing that uses partial information about the global state of the system to perform load balancing. This algorithm, referred to as GrapevineLB, consists of two stages: global information propagation using a lightweight algorithm inspired by epidemic [21] algorithms, and work unit transfer using a randomized algorithm. We provide analysis of the algorithm along with detailed simulation and performance comparison with other load balancing strategies. We demonstrate the effectiveness of GrapevineLB for adaptive mesh refinement and molecular dynamics on up to 131,072 cores of BlueGene/Q.
KW - Distributed load balancer
KW - Epidemic algorithm
KW - Load balancing
UR - http://www.scopus.com/inward/record.url?scp=84899693808&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84899693808&partnerID=8YFLogxK
U2 - 10.1145/2503210.2503284
DO - 10.1145/2503210.2503284
M3 - Conference contribution
AN - SCOPUS:84899693808
SN - 9781450323789
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2013
PB - IEEE Computer Society
T2 - 2013 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2013
Y2 - 17 November 2013 through 22 November 2013
ER -