TY - GEN
T1 - Betty: Enabling Large-Scale GNN Training with Batch-Level Graph Partitioning
T2 - 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2023
AU - Yang, Shuangyan
AU - Zhang, Minjia
AU - Dong, Wenqian
AU - Li, Dong
N1 - Publisher Copyright:
© 2023 Owner/Author.
PY - 2023/1/27
Y1 - 2023/1/27
N2 - The Graph Neural Network (GNN) is showing outstanding results in improving the performance of graph-based applications. Recent studies demonstrate that GNN performance can be boosted by using more advanced aggregators, deeper aggregation depths, larger sampling rates, etc. While leading to promising results, these improvements come at the cost of a significantly increased memory footprint, easily exceeding GPU memory capacity. In this paper, we introduce a method, Betty, to make GNN training more scalable and accessible via batch-level partitioning. Different from DNN training, a mini-batch in GNN training has complex dependencies between input features and output labels, making batch-level partitioning difficult. Betty introduces two novel techniques, redundancy-embedded graph (REG) partitioning and memory-aware partitioning, to effectively mitigate redundancy and load-imbalance issues across the partitions. Our evaluation on large-scale real-world datasets shows that Betty can significantly mitigate the memory bottleneck, enabling scalable GNN training with much deeper aggregation depths, larger sampling rates, larger training batch sizes, and more advanced aggregators, with as few as a single GPU.
AB - The Graph Neural Network (GNN) is showing outstanding results in improving the performance of graph-based applications. Recent studies demonstrate that GNN performance can be boosted by using more advanced aggregators, deeper aggregation depths, larger sampling rates, etc. While leading to promising results, these improvements come at the cost of a significantly increased memory footprint, easily exceeding GPU memory capacity. In this paper, we introduce a method, Betty, to make GNN training more scalable and accessible via batch-level partitioning. Different from DNN training, a mini-batch in GNN training has complex dependencies between input features and output labels, making batch-level partitioning difficult. Betty introduces two novel techniques, redundancy-embedded graph (REG) partitioning and memory-aware partitioning, to effectively mitigate redundancy and load-imbalance issues across the partitions. Our evaluation on large-scale real-world datasets shows that Betty can significantly mitigate the memory bottleneck, enabling scalable GNN training with much deeper aggregation depths, larger sampling rates, larger training batch sizes, and more advanced aggregators, with as few as a single GPU.
KW - Efficient training method
KW - Graph neural network
KW - Graph partition
KW - Heterogeneous memory
KW - Load balancing
KW - Redundancy elimination
UR - http://www.scopus.com/inward/record.url?scp=85147728578&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147728578&partnerID=8YFLogxK
U2 - 10.1145/3575693.3575725
DO - 10.1145/3575693.3575725
M3 - Conference contribution
AN - SCOPUS:85147728578
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 103
EP - 117
BT - ASPLOS 2023 - Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
A2 - Aamodt, Tor M.
A2 - Jerger, Natalie Enright
A2 - Swift, Michael
PB - Association for Computing Machinery
Y2 - 25 March 2023 through 29 March 2023
ER -