For many applications, computation load varies over time. Such applications require dynamic load balancing to improve performance. Centralized load balancing schemes, which perform the load balancing decisions at a central location, are not scalable. In contrast, fully distributed strategies are scalable but typically do not produce a balanced work dis-tribution as they tend to consider only local information. This paper describes a fully distributed algorithm for load balancing that uses partial information about the global state of the system to perform load balancing. This algo-rithm, referred to as GrapevineLB, consists of two stages: global information propagation using a lightweight algo-rithm inspired by epidemic  algorithms, and work unit transfer using a randomized algorithm. We provide analysis of the algorithm along with detailed simulation and perfor-mance comparison with other load balancing strategies. We demonstrate the effectiveness of GrapevineLB for adaptive mesh refinement and molecular dynamics on up to 131,072 cores of BlueGene/Q.