Unsupervised aggregation for classification problems with large numbers of categories

Ivan Titov, Alexandre Klementiev, Kevin Small, Dan Roth

Research output: Contribution to journal › Conference article › peer-review

Abstract

Classification problems with a very large or unbounded set of output categories are common in many areas such as natural language and image processing. In order to improve accuracy on these tasks, it is natural for a decision-maker to combine predictions from various sources. However, supervised data needed to fit an aggregation model is often difficult to obtain, especially if needed for multiple domains. Therefore, we propose a generative model for unsupervised aggregation which exploits the agreement signal to estimate the expertise of individual judges. Due to the large output space size, this aggregation model cannot encode expertise of constituent judges with respect to every category for all problems. Consequently, we extend it by incorporating the notion of category types to account for variability of the judge expertise depending on the type. The viability of our approach is demonstrated both on synthetic experiments and on a practical task of syntactic parser aggregation.
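The agreement idea in the abstract can be illustrated with a minimal sketch. This is not the authors' generative model: it assumes a much simpler setup in which each judge has a single accuracy parameter, wrong answers are spread uniformly over the remaining categories, and the true label is always one of the labels some judge proposed. It also omits category types entirely. The function name `aggregate`, the parameter `num_categories`, the initial accuracy of 0.7, and the clamping bounds are all hypothetical choices made for illustration.

```python
def aggregate(predictions, num_categories=1000, iterations=20):
    """Estimate consensus labels and per-judge accuracy without supervision.

    predictions: list of items; each item is a list with one label per judge.
    Assumes (for illustration only) one accuracy value per judge and uniform
    errors over the other num_categories - 1 labels.
    """
    num_judges = len(predictions[0])
    accuracy = [0.7] * num_judges  # assumed starting expertise for every judge
    posteriors = []

    for _ in range(iterations):
        # E-step: posterior over the candidate labels proposed for each item.
        posteriors = []
        for labels in predictions:
            scores = {}
            for candidate in set(labels):
                score = 1.0
                for j, label in enumerate(labels):
                    if label == candidate:
                        score *= accuracy[j]
                    else:
                        score *= (1.0 - accuracy[j]) / (num_categories - 1)
                scores[candidate] = score
            total = sum(scores.values())
            posteriors.append({c: s / total for c, s in scores.items()})

        # M-step: a judge's accuracy is the expected fraction of items
        # on which that judge's label matches the (latent) true label.
        for j in range(num_judges):
            expected_correct = sum(
                post[labels[j]] for labels, post in zip(predictions, posteriors)
            )
            estimate = expected_correct / len(predictions)
            # Clamp to keep every candidate score strictly positive.
            accuracy[j] = min(max(estimate, 0.01), 0.99)

    consensus = [max(post, key=post.get) for post in posteriors]
    return consensus, accuracy


# Toy data: judges 0 and 1 always agree; judge 2 deviates on half the items.
predictions = [
    ["x1", "x1", "y1"],
    ["x2", "x2", "x2"],
    ["x3", "x3", "y3"],
    ["x4", "x4", "x4"],
]
consensus, accuracy = aggregate(predictions)
```

With a large category space, two judges agreeing by chance is unlikely under this assumed error model, so agreement alone is strong evidence of correctness. That is why the sketch can rank judges without any labeled data.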

Original language: English (US)
Pages (from-to): 836-843
Number of pages: 8
Journal: Journal of Machine Learning Research
Volume: 9
State: Published - Jan 1 2010
Event: 13th International Conference on Artificial Intelligence and Statistics, AISTATS 2010 - Sardinia, Italy
Duration: May 13 2010 to May 15 2010

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence
