TY - JOUR
T1 - Generalizing Lloyd’s Algorithm for Graph Clustering
AU - Zaman, Tareq
AU - Nytko, Nicolas
AU - Taghibakhshi, Ali
AU - Maclachlan, Scott
AU - Olson, Luke
AU - West, Matthew
N1 - The work of the fourth author was partially supported by an NSERC Discovery Grant. This material is based in part upon work supported by the Department of Energy, National Nuclear Security Administration, under award DE-NA0003963.
\\ast Submitted to the journal's Numerical Algorithms for Scientific Computing section March 3, 2023; accepted for publication (in revised form) April 15, 2024; published electronically September 3, 2024. https://doi.org/10.1137/23M1556800 Funding: The work of the fourth author was partially supported by an NSERC Discovery Grant. This material is based in part upon work supported by the Department of Energy, National Nuclear Security Administration, under award DE-NA0003963. \\dagger Interdisciplinary Program in Scientific Computing, Memorial University of Newfoundland, St. John's, NL A1C 5S7, Canada ([email protected], [email protected]). \\ddagger Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801 USA ([email protected], [email protected]). \\S Department of Mechanical Science and Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801 USA ([email protected], [email protected]).
PY - 2024
Y1 - 2024
N2 - Clustering is a commonplace problem in many areas of data science, with applications in biology and bioinformatics, understanding chemical structure, image segmentation, building recommender systems, and many more fields. While there are many different clustering variants (based on given distance or graph structure, probability distributions, or data density), we consider here the problem of clustering nodes in a graph, motivated by the problem of aggregating discrete degrees of freedom in multigrid and domain decomposition methods for solving sparse linear systems. Specifically, we consider the challenge of forming balanced clusters in the graph of a sparse matrix for use in algebraic multigrid, although the algorithm has general applicability. Based on an extension of the Bellman-Ford algorithm, we generalize Lloyd's algorithm for partitioning subsets of \BbbRn to balance the number of nodes in each cluster; this is accompanied by a rebalancing algorithm that reduces the overall energy in the system. The algorithm provides control over the number of clusters and leads to ``well centered"" partitions of the graph. Theoretical results are provided to establish linear complexity and numerical results in the context of algebraic multigrid highlight the benefits of improved clustering.
AB - Clustering is a commonplace problem in many areas of data science, with applications in biology and bioinformatics, understanding chemical structure, image segmentation, building recommender systems, and many more fields. While there are many different clustering variants (based on given distance or graph structure, probability distributions, or data density), we consider here the problem of clustering nodes in a graph, motivated by the problem of aggregating discrete degrees of freedom in multigrid and domain decomposition methods for solving sparse linear systems. Specifically, we consider the challenge of forming balanced clusters in the graph of a sparse matrix for use in algebraic multigrid, although the algorithm has general applicability. Based on an extension of the Bellman-Ford algorithm, we generalize Lloyd's algorithm for partitioning subsets of \BbbRn to balance the number of nodes in each cluster; this is accompanied by a rebalancing algorithm that reduces the overall energy in the system. The algorithm provides control over the number of clusters and leads to ``well centered"" partitions of the graph. Theoretical results are provided to establish linear complexity and numerical results in the context of algebraic multigrid highlight the benefits of improved clustering.
KW - aggregation
KW - clustering
KW - graph partitioning
KW - multigrid
UR - http://www.scopus.com/inward/record.url?scp=85204351348&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85204351348&partnerID=8YFLogxK
U2 - 10.1137/23M1556800
DO - 10.1137/23M1556800
M3 - Article
AN - SCOPUS:85204351348
SN - 1064-8275
VL - 46
SP - A2819-A2847
JO - SIAM Journal on Scientific Computing
JF - SIAM Journal on Scientific Computing
IS - 5
ER -