Discriminative training of clustering functions: Theory and experiments with entity identification

Xin Li, Dan Roth

Research output: Contribution to conference, Paper, peer-reviewed

Abstract

Clustering is an optimization procedure that partitions a set of elements to optimize some criteria, based on a fixed distance metric defined between the elements. Clustering approaches have been widely applied in natural language processing and it has been shown repeatedly that their success depends on defining a good distance metric, one that is appropriate for the task and the clustering algorithm used. This paper develops a framework in which clustering is viewed as a learning task, and proposes a way to train a distance metric that is appropriate for the chosen clustering algorithm in the context of the given task. Experiments in the context of the entity identification problem exhibit significant performance improvements over state-of-the-art clustering approaches developed for this problem.

Original language: English (US)
Pages: 64-71
Number of pages: 8
State: Published - 2005
Externally published: Yes
Event: 9th Conference on Computational Natural Language Learning, CoNLL 2005 - Ann Arbor, MI, United States
Duration: Jun 29 2005 to Jun 30 2005

ASJC Scopus subject areas

  • Artificial Intelligence
  • Human-Computer Interaction
  • Linguistics and Language
