TY - GEN
T1 - Graphite: Optimizing Graph Neural Networks on CPUs Through Cooperative Software-Hardware Techniques
T2 - 49th IEEE/ACM International Symposium on Computer Architecture, ISCA 2022
AU - Gong, Zhangxiaowen
AU - Ji, Houxiang
AU - Yao, Yao
AU - Fletcher, Christopher W.
AU - Hughes, Christopher J.
AU - Torrellas, Josep
N1 - This work was supported in part by a gift from Intel Corporation and by NSF grants CCF 1725734, CNS 1909999, CNS 1942888, and CCF 2028861.
PY - 2022/6/18
Y1 - 2022/6/18
N2 - Graph Neural Networks (GNNs) are becoming popular because they are effective at extracting information from graphs. To execute GNNs, CPUs are good platforms because of their high availability and terabyte-level memory capacity, which enables full-batch computation on large graphs. However, GNNs on CPUs are heavily memory bound, which limits their performance. In this paper, we address this problem by alleviating the stress of GNNs on memory with cooperative software-hardware techniques. Our software techniques include: (i) layer fusion that overlaps the memory-intensive phase and the compute-intensive phase in a GNN layer, (ii) feature compression that reduces memory traffic by exploiting the sparsity in the vertex feature vectors, and (iii) an algorithm that changes the processing order of vertices to improve temporal locality. On top of the software techniques, we enhance the CPUs' direct memory access (DMA) engines with the capability to execute the GNNs' memory-intensive phase, so that the processor cores can focus on the compute-intensive phase. We call the combination of our software and hardware techniques Graphite. We evaluate Graphite with popular GNN models on large graphs. The result is high-performance full-batch GNN training and inference on CPUs. Our software techniques outperform a state-of-the-art GNN layer implementation by 1.7-1.9x in inference and 1.6-2.6x in training. Our combined software and hardware techniques speed up inference by 1.6-2.0x and training by 1.9-3.1x.
AB - Graph Neural Networks (GNNs) are becoming popular because they are effective at extracting information from graphs. To execute GNNs, CPUs are good platforms because of their high availability and terabyte-level memory capacity, which enables full-batch computation on large graphs. However, GNNs on CPUs are heavily memory bound, which limits their performance. In this paper, we address this problem by alleviating the stress of GNNs on memory with cooperative software-hardware techniques. Our software techniques include: (i) layer fusion that overlaps the memory-intensive phase and the compute-intensive phase in a GNN layer, (ii) feature compression that reduces memory traffic by exploiting the sparsity in the vertex feature vectors, and (iii) an algorithm that changes the processing order of vertices to improve temporal locality. On top of the software techniques, we enhance the CPUs' direct memory access (DMA) engines with the capability to execute the GNNs' memory-intensive phase, so that the processor cores can focus on the compute-intensive phase. We call the combination of our software and hardware techniques Graphite. We evaluate Graphite with popular GNN models on large graphs. The result is high-performance full-batch GNN training and inference on CPUs. Our software techniques outperform a state-of-the-art GNN layer implementation by 1.7-1.9x in inference and 1.6-2.6x in training. Our combined software and hardware techniques speed up inference by 1.6-2.0x and training by 1.9-3.1x.
KW - CPU
KW - DMA
KW - Graph Neural Networks
KW - Hardware-software co-design
UR - http://www.scopus.com/inward/record.url?scp=85132829415&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132829415&partnerID=8YFLogxK
U2 - 10.1145/3470496.3527403
DO - 10.1145/3470496.3527403
M3 - Conference contribution
AN - SCOPUS:85132829415
T3 - Proceedings - International Symposium on Computer Architecture
SP - 916
EP - 931
BT - ISCA 2022 - Proceedings of the 49th Annual International Symposium on Computer Architecture
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 June 2022 through 22 June 2022
ER -