A survey of text clustering algorithms

Charu C. Aggarwal, Cheng Xiang Zhai

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Clustering is a widely studied data mining problem in the text domain. The problem finds numerous applications in customer segmentation, classification, collaborative filtering, visualization, document organization, and indexing. In this chapter, we provide a detailed survey of the problem of text clustering. We study the key challenges of the clustering problem as it applies to the text domain. We discuss the key methods used for text clustering and their relative advantages. We also discuss a number of recent advances in the area in the context of social networks and linked data.

Original language: English (US)
Title of host publication: Mining Text Data
Publisher: Springer US
Pages: 77-128
Number of pages: 52
Volume: 9781461432234
ISBN (Electronic): 9781461432234
ISBN (Print): 1461432227, 9781461432227
DOI: https://doi.org/10.1007/978-1-4614-3223-4_4
State: Published - Aug 1 2012

Keywords

  • Text clustering

ASJC Scopus subject areas

  • Computer Science (all)


Cite this

Aggarwal, C. C., & Zhai, C. X. (2012). A survey of text clustering algorithms. In Mining Text Data (Vol. 9781461432234, pp. 77-128). Springer US. https://doi.org/10.1007/978-1-4614-3223-4_4