Incremental clustering and dynamic information retrieval

Moses Charikar, Chandra Chekuri, Tomas Feder, Rajeev Motwani

Research output: Contribution to journalConference articlepeer-review

Abstract

Motivated by applications such as document and image classification in information retrieval, we consider the problem of clustering dynamic point sets in a metric space. We propose a model called incremental clustering which is based on a careful analysis of the requirements of the information retrieval application, and which should also be useful in other applications. The goal is to efficiently maintain clusters of small diameter as new points are inserted. We analyze several natural greedy algorithms and demonstrate that they perform poorly. We propose new deterministic and randomized incremental clustering algorithms which have a probably good performance. We complement our positive results with lower bounds on the performance of incremental algorithms. Finally, we consider the dual clustering problem where the clusters are of fixed diameter, and the goal is to minimize the number of clusters.

Original languageEnglish (US)
Pages (from-to)626-635
Number of pages10
JournalConference Proceedings of the Annual ACM Symposium on Theory of Computing
DOIs
StatePublished - 1997
Externally publishedYes
EventProceedings of the 1997 29th Annual ACM Symposium on Theory of Computing - El Paso, TX, USA
Duration: May 4 1997May 6 1997

ASJC Scopus subject areas

  • Software

Fingerprint

Dive into the research topics of 'Incremental clustering and dynamic information retrieval'. Together they form a unique fingerprint.

Cite this