Data clustering and graph partitioning via simulated mixing

Shahzad Bhatti, Carolyn Beck, Angelia Nedich

Research output: Contribution to journalArticlepeer-review

Abstract

Spectral clustering approaches have led to well-accepted algorithms for finding accurate clusters in a given dataset. However, their application to large-scale datasets has been hindered by the computational complexity of eigenvalue decompositions. Several algorithms have been proposed in the recent past to accelerate spectral clustering, however they compromise on the accuracy of the spectral clustering to achieve faster speed. In this paper, we propose a novel spectral clustering algorithm based on a mixing process on a graph. Unlike the existing spectral clustering algorithms, our algorithm does not require computing eigenvectors. Specifically, it finds the equivalent of a linear combination of eigenvectors of the normalized similarity matrix weighted with corresponding eigenvalues. This linear combination is then used to partition the dataset into meaningful clusters. Simulations on real datasets show that partitioning datasets based on such linear combinations of eigenvectors achieves better accuracy than standard spectral clustering methods as the number of clusters increase. Our algorithm can easily be implemented in a distributed setting.

Original languageEnglish (US)
Article number8331142
Pages (from-to)1-14
Number of pages14
JournalIEEE Transactions on Network Science and Engineering
Volume6
Issue number3
DOIs
StatePublished - Apr 3 2019

Keywords

  • Clustering algorithms
  • Community detection
  • Graph partitioning
  • Random walk

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Computer Science Applications
  • Computer Networks and Communications

Fingerprint Dive into the research topics of 'Data clustering and graph partitioning via simulated mixing'. Together they form a unique fingerprint.

Cite this