Parallel K-clique counting on GPUs

Mohammad Almasri, Izzat El Hajj, Rakesh Nagi, Jinjun Xiong, Wen Mei Hwu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Counting k-cliques in a graph is an important problem in graph analysis with many applications such as community detection and graph partitioning. Counting k-cliques is typically done by traversing search trees starting at each vertex in the graph. Parallelizing k-clique counting has been well-studied on CPUs and many solutions exist. However, there are no performant solutions for k-clique counting on GPUs. Parallelizing k-clique counting on GPUs comes with numerous challenges such as the need for extracting fine-grain multi-level parallelism, sensitivity to load imbalance, and constrained physical memory capacity. While there has been work on related problems such as finding maximal cliques and generalized sub-graph matching on GPUs, k-clique counting in particular has yet to be explored in depth. In this paper, we present the first parallel GPU solution specialized for the k-clique counting problem. Our solution supports both graph orientation and pivoting for eliminating redundant clique discovery. It incorporates both vertex-centric and edge-centric parallelization schemes for distributing work across thread blocks, and further partitions work within each thread block to extract fine-grain multi-level parallelism while tolerating load imbalance. It also includes optimizations such as binary encoding of induced sub-graphs and sub-warp partitioning to limit memory consumption and improve the utilization of execution resources. Our evaluation shows that our best GPU implementation outperforms the best state-of-the-art parallel CPU implementation by a geometric mean of 12.39X, 6.21X, and 18.99X for k = 4, 7, and 10, respectively. We also perform a detailed evaluation of the trade-offs involved in the choice of parallelization scheme, and the incremental speedup of each optimization to provide an in-depth understanding of the optimization space. The insights from our optimization flow can be useful for optimizing other clique finding and graph mining solutions on GPUs. Our code will be open-sourced to enable further research on GPU parallelization of k-clique counting and other similar graph mining algorithms.

Original languageEnglish (US)
Title of host publicationProceedings of the 36th ACM International Conference on Supercomputing, ICS 2022
PublisherAssociation for Computing Machinery
ISBN (Electronic)9781450392815
DOIs
StatePublished - Jun 28 2022
Event36th ACM International Conference on Supercomputing, ICS 2022 - Virtual, Online
Duration: Jun 27 2022Jun 30 2022

Publication series

NameProceedings of the International Conference on Supercomputing

Conference

Conference36th ACM International Conference on Supercomputing, ICS 2022
CityVirtual, Online
Period6/27/226/30/22

Keywords

  • GPU
  • Graphs
  • K-clique counting
  • Parallel search tree traversal

ASJC Scopus subject areas

  • General Computer Science

Fingerprint

Dive into the research topics of 'Parallel K-clique counting on GPUs'. Together they form a unique fingerprint.

Cite this