Text Chunking based on a Generalization of Winnow

Tong Zhang, Fred Damerau, David Johnson

Research output: Contribution to journal › Article › peer-review

Abstract

This paper describes a text chunking system based on a generalization of the Winnow algorithm. We propose a general statistical model for text chunking which we then convert into a classification problem. We argue that the Winnow family of algorithms is particularly suitable for solving classification problems arising from NLP applications, due to their robustness to irrelevant features. In theory, however, Winnow may not converge for linearly non-separable data. To remedy this problem, we employ a generalization of the original Winnow method. An additional advantage of the new algorithm is that it provides reliable confidence estimates for its classification predictions. This property is required in our statistical modeling approach. We show that our system achieves state-of-the-art performance in text chunking with less computational cost than previous systems.
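For readers unfamiliar with the method the abstract builds on, the following is a minimal sketch of the classic Winnow update (multiplicative promotion and demotion of the weights of active features after each mistake). It is not the generalized algorithm from the paper, which the abstract does not specify. The function names, the choice `alpha=2.0`, the threshold equal to the number of features, and the toy data are all assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[int], int]  # (binary feature vector, label in {0, 1})


def winnow_score(weights: Sequence[float], x: Sequence[int]) -> float:
    """Sum of the weights of the active (non-zero) features."""
    return sum(w for w, xi in zip(weights, x) if xi)


def winnow_predict(weights: Sequence[float], threshold: float, x: Sequence[int]) -> int:
    """Predict 1 when the score reaches the threshold, else 0."""
    return 1 if winnow_score(weights, x) >= threshold else 0


def winnow_train(
    examples: Sequence[Example],
    n_features: int,
    alpha: float = 2.0,      # assumed promotion/demotion factor
    max_epochs: int = 50,    # assumed cap on passes over the data
) -> Tuple[List[float], float]:
    """Classic mistake-driven Winnow; hypothetical helper, not the paper's variant."""
    weights = [1.0] * n_features
    threshold = float(n_features)  # assumed threshold choice
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in examples:
            if winnow_predict(weights, threshold, x) == y:
                continue
            mistakes += 1
            # Promote active features on a missed positive, demote on a false positive.
            factor = alpha if y == 1 else 1.0 / alpha
            weights = [w * factor if xi else w for w, xi in zip(weights, x)]
        if mistakes == 0:  # a clean pass: the data are fit exactly
            break
    return weights, threshold


# Toy data (hypothetical): the label is x[0] OR x[2]; the other four features are irrelevant.
examples = [(x, int(x[0] or x[2])) for x in product((0, 1), repeat=6)]
weights, threshold = winnow_train(examples, n_features=6)
```

On separable toy data like this, the loop stops after a pass with no mistakes. On linearly non-separable data it may simply run until `max_epochs`, which is the convergence issue the abstract says the generalized method is meant to address.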

Original language: English (US)
Pages (from-to): 615-637
Number of pages: 23
Journal: Journal of Machine Learning Research
Volume: 2
Issue number: 4
State: Published - 2002
Externally published: Yes

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence
