Abstract
Graph Neural Networks (GNNs) are a powerful class of machine learning models with applications in recommender systems, drug discovery, social network analysis, and computer vision. One challenge with their implementation is that GNNs often take large-scale graphs as inputs, which imposes significant computational/storage costs in the training and testing phases. In particular, the message passing operations of a GNN require multiplication of the graph adjacency matrix A ∈ ℝ^{n×n} and the data matrix X ∈ ℝ^{n×d}, and the O(n²d) time complexity can be prohibitive for large n. Thus, a natural question is whether it is possible to perform the GNN operations in (quasi-)linear time by avoiding the full computation of AX. To study this question, we consider the setting of a regression task on a two-layer Linear Graph Convolutional Network (GCN). We develop an efficient training algorithm based on (1) performing node subsampling, (2) estimating the leverage scores of AX based on the subsampled graph, and (3) performing leverage score sampling on AX. We show that our proposed scheme learns the regression model observing only O(ndε⁻² log n) entries of A in time O(nd²ε⁻² log n), with the guarantee that the learned weights deviate by at most ε under the ℓ₂ norm from the model learned using the entire adjacency matrix A. We present empirical results for regression problems on real-world graphs and show that our algorithm significantly outperforms other baseline sampling strategies that exploit the same number of observations.
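For intuition, the following is a minimal NumPy sketch of the three-step pipeline described in the abstract (node subsampling, leverage score estimation from the subsample, and leverage score sampling on AX), applied to the collapsed linear regression min_w ||AXw − y||. It is an illustrative approximation, not the paper's exact procedure: the function names, the parameters `n_sub` and `n_samples`, and the SVD-based score estimator are assumptions for this example; the paper's estimator, rescaling, and sample sizes follow its theoretical analysis.

```python
import numpy as np

def approx_leverage_scores(M):
    """Leverage scores of the rows of M via a thin SVD.
    (Illustrative only: the paper estimates scores from a node-subsampled
    graph rather than from the full product AX.)"""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return np.sum(U**2, axis=1)          # l_i = ||U_i||^2

def leverage_score_regression(A, X, y, n_sub=500, n_samples=1000, seed=None):
    """Hypothetical sketch of the pipeline:
    1. subsample nodes, 2. estimate leverage scores of AX from the
    subsample, 3. sample rows of AX by those scores and solve the
    regression on the sketched system."""
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Step 1: node subsampling -- keep a random subset of columns of A,
    # so the proxy for A @ X touches only n_sub entries per row of A.
    cols = rng.choice(n, size=min(n_sub, n), replace=False)
    M_sub = A[:, cols] @ X[cols]          # cheap proxy for A @ X

    # Step 2: estimate leverage scores from the subsampled product.
    scores = approx_leverage_scores(M_sub)
    probs = scores / scores.sum()

    # Step 3: leverage-score sampling of rows; only the sampled rows of
    # A @ X are ever formed, each rescaled by 1 / sqrt(s * p_i).
    s = min(n_samples, n)
    rows = rng.choice(n, size=s, replace=True, p=probs)
    scale = 1.0 / np.sqrt(s * probs[rows])
    M_rows = (A[rows] @ X) * scale[:, None]
    y_rows = y[rows] * scale

    # Fit the regression weights on the sketch.
    w, *_ = np.linalg.lstsq(M_rows, y_rows, rcond=None)
    return w
```

In this sketch only the subsampled columns of A (Step 1) and the sampled rows of A (Step 3) are ever read, which mirrors the abstract's point that the full product AX is never computed; the guarantee on how close the sketched weights are to the full-data weights is what the paper's analysis establishes.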
| Original language | English (US) |
|---|---|
| Pages (from-to) | 55479-55501 |
| Number of pages | 23 |
| Journal | Advances in Neural Information Processing Systems |
| Volume | 36 |
| State | Published - 2023 |
| Event | 37th Conference on Neural Information Processing Systems, NeurIPS 2023, New Orleans, United States, Dec 10 2023 → Dec 16 2023 |
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Signal Processing