Learning to resolve natural language ambiguities: A unified approach

Dan Roth

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We analyze a few of the commonly used statistics-based and machine learning algorithms for natural language disambiguation tasks and observe that they can be recast as learning linear separators in the feature space. Each of the methods makes a priori assumptions, which it employs, given the data, when searching for its hypothesis. Nevertheless, as we show, it searches a space that is as rich as the space of all linear separators. We use this to build an argument for a data-driven approach which merely searches for a good linear separator in the feature space, without further assumptions on the domain or a specific problem. We present such an approach - a sparse network of linear separators, utilizing the Winnow learning algorithm - and show how to use it in a variety of ambiguity resolution problems. The learning approach presented is attribute-efficient and, therefore, appropriate for domains having a very large number of attributes. In particular, we present an extensive experimental comparison of our approach with other methods on several well-studied lexical disambiguation tasks such as context-sensitive spelling correction, prepositional phrase attachment, and part-of-speech tagging. In all cases we show that our approach either outperforms other methods tried for these tasks or performs comparably to the best.
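For readers unfamiliar with Winnow, the following is a minimal sketch of a textbook Winnow-style mistake-driven learner over sparse Boolean features. It is not the architecture evaluated in the paper: the class name `Winnow`, the helper `train`, the threshold and update factors, and the toy "weather"/"whether" contexts are all assumptions made up for illustration. Weights are stored only for features that have actually been updated, which loosely reflects why such a learner suits domains with very many attributes.

```python
class Winnow:
    """Textbook Winnow-style learner over sparse Boolean features (illustrative)."""

    def __init__(self, threshold, promotion=2.0, demotion=0.5):
        self.threshold = threshold    # predict positive when score >= threshold
        self.promotion = promotion    # multiplier applied after a false negative
        self.demotion = demotion      # multiplier applied after a false positive
        self.weights = {}             # sparse: untouched features weigh 1.0

    def score(self, active):
        # Linear score: sum of the weights of the features active in the example.
        return sum(self.weights.get(f, 1.0) for f in active)

    def predict(self, active):
        return self.score(active) >= self.threshold

    def update(self, active, label):
        """Update multiplicatively on a mistake; return True if a mistake occurred."""
        if self.predict(active) == label:
            return False
        factor = self.promotion if label else self.demotion
        for f in active:
            self.weights[f] = self.weights.get(f, 1.0) * factor
        return True


def train(model, examples, max_epochs=20):
    """Cycle through the examples until an epoch passes with no mistakes."""
    for epoch in range(1, max_epochs + 1):
        mistakes = sum(model.update(x, y) for x, y in examples)
        if mistakes == 0:
            return epoch
    return max_epochs


# Hypothetical toy data: context words around a blank, labeled True when the
# intended word is "weather" and False when it is "whether".
examples = [
    ({"the", "forecast"}, True),
    ({"know", "or"}, False),
    ({"cold", "the"}, True),
    ({"ask", "or"}, False),
]

model = Winnow(threshold=2.5)
train(model, examples)
for context, label in examples:
    print(sorted(context), "->", model.predict(context), "(expected", label, ")")
```

The hypothesis learned here is a single linear separator: a weighted sum of active features compared against a threshold. A network in the spirit of the abstract would keep one such separator per target word, but the details of that design are not given in this record.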

Original language: English (US)
Title of host publication: Proceedings of the National Conference on Artificial Intelligence
Editors: Anon
Publisher: AAAI
Pages: 806-813
Number of pages: 8
State: Published - 1998
Event: Proceedings of the 1998 15th National Conference on Artificial Intelligence, AAAI - Madison, WI, USA
Duration: Jul 26 1998 to Jul 30 1998

Other

Other: Proceedings of the 1998 15th National Conference on Artificial Intelligence, AAAI
City: Madison, WI, USA
Period: 7/26/98 to 7/30/98

ASJC Scopus subject areas

  • Software
