Model-based overlapping clustering

Arindam Banerjee, Chase Krumpelman, Joydeep Ghosh, Sugato Basu, Raymond J. Mooney

Research output: Contribution to conference › Paper › peer-review


While the vast majority of clustering algorithms are partitional, many real-world datasets have inherently overlapping clusters. Several approaches to finding overlapping clusters have come from work on the analysis of biological datasets. In this paper, we interpret an overlapping clustering model proposed by Segal et al. [23] as a generalization of Gaussian mixture models, and we extend it to an overlapping clustering model based on mixtures of any regular exponential family distribution and the corresponding Bregman divergence. We provide the necessary algorithm modifications for this extension, and present results on synthetic data as well as on subsets of the 20-Newsgroups and EachMovie datasets.
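To make the link between exponential families and Bregman divergences concrete, here is a minimal Python sketch of the standard definition d_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>. It is an illustration only, not code from the paper: the function names (`bregman`, `sq_norm`, `neg_entropy`) and the sample vectors are hypothetical. The two generators shown give the squared Euclidean distance (usually paired with the spherical Gaussian) and the I-divergence, or generalized KL divergence (usually paired with the Poisson).

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """Bregman divergence d_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return phi(x) - phi(y) - np.dot(x - y, grad_phi(y))

# Generator phi(v) = 0.5 * ||v||^2 yields half the squared Euclidean distance.
def sq_norm(v):
    return 0.5 * np.dot(v, v)

def sq_norm_grad(v):
    return v

# Generator phi(v) = sum(v log v - v) yields the I-divergence (needs v > 0).
def neg_entropy(v):
    return np.sum(v * np.log(v) - v)

def neg_entropy_grad(v):
    return np.log(v)

# Hypothetical sample points, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 2.0, 1.0])

d_euclid = bregman(sq_norm, sq_norm_grad, x, y)
d_idiv = bregman(neg_entropy, neg_entropy_grad, x, y)
```

Swapping the generator is the only change needed to move from one divergence to another, which is roughly the sense in which a Gaussian-based model can be generalized to other exponential family members.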

Original language: English (US)
Number of pages: 6
State: Published - 2005
Externally published: Yes
Event: KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Chicago, IL, United States
Duration: Aug 21 2005 to Aug 24 2005


Other: KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country/Territory: United States
City: Chicago, IL


  • Bregman divergences
  • Exponential model
  • Graphical model
  • High-dimensional clustering
  • Overlapping clustering

ASJC Scopus subject areas

  • Software
  • Information Systems

