TY - GEN
T1 - A hierarchical approach to reducing communication in parallel graph algorithms
AU - Harshvardhan
AU - Amato, Nancy Marie
AU - Rauchwerger, Lawrence
PY - 2015/1/24
Y1 - 2015/1/24
N2 - Large-scale graph computing has become critical due to the ever-increasing size of data. However, distributed graph computations are limited in their scalability and performance due to the heavy communication inherent in such computations. This is exacerbated in scale-free networks, such as social and web graphs, which contain hub vertices that have large degrees and therefore send a large number of messages over the network. Furthermore, many graph algorithms and computations send the same data to each of the neighbors of a vertex. Our proposed approach recognizes this and reduces the communication performed by the algorithm, without changes to user code, through a hierarchical machine model imposed upon the input graph. The hierarchical model takes advantage of locale information of the neighboring vertices to reduce communication, both in message volume and in total number of bytes sent. It is also able to better exploit the machine hierarchy to further reduce communication costs by aggregating traffic between different levels of the machine hierarchy. Results of an implementation in the STAPL GL show improved scalability and performance over the traditional level-synchronous approach, with 2.5× to 8× improvement for a variety of graph algorithms at 12,000+ cores.
AB - Large-scale graph computing has become critical due to the ever-increasing size of data. However, distributed graph computations are limited in their scalability and performance due to the heavy communication inherent in such computations. This is exacerbated in scale-free networks, such as social and web graphs, which contain hub vertices that have large degrees and therefore send a large number of messages over the network. Furthermore, many graph algorithms and computations send the same data to each of the neighbors of a vertex. Our proposed approach recognizes this and reduces the communication performed by the algorithm, without changes to user code, through a hierarchical machine model imposed upon the input graph. The hierarchical model takes advantage of locale information of the neighboring vertices to reduce communication, both in message volume and in total number of bytes sent. It is also able to better exploit the machine hierarchy to further reduce communication costs by aggregating traffic between different levels of the machine hierarchy. Results of an implementation in the STAPL GL show improved scalability and performance over the traditional level-synchronous approach, with 2.5× to 8× improvement for a variety of graph algorithms at 12,000+ cores.
KW - Big Data
KW - Distributed computing
KW - Graph analytics
KW - Parallel graph processing
UR - http://www.scopus.com/inward/record.url?scp=84939208796&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84939208796&partnerID=8YFLogxK
U2 - 10.1145/2688500.2700994
DO - 10.1145/2688500.2700994
M3 - Conference contribution
AN - SCOPUS:84939208796
T3 - Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP
SP - 285
EP - 286
BT - 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015 - Proceedings
PB - Association for Computing Machinery
T2 - 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015
Y2 - 7 February 2015 through 11 February 2015
ER -