Exploiting the Wikipedia structure in local and global classification of taxonomic relations

Quang Xuan Do, Dan Roth

Research output: Contribution to journalArticle

Abstract

Determining whether two terms have an ancestor relation (e.g. Toyota Camry and car) or a sibling relation (e.g. Toyota and Honda) is an essential component of textual inference in Natural Language Processing applications such as Question Answering, Summarization, and Textual Entailment. Significant work has been done on developing knowledge sources that could support these tasks, but these resources usually suffer from low coverage, noise, and are inflexible when dealing with ambiguous and general terms that may not appear in any stationary resource, making their use as general purpose background knowledge resources difficult. In this paper, rather than building a hierarchical structure of concepts and relations, we describe an algorithmic approach that, given two terms, determines the taxonomic relation between them using a machine learning-based approach that makes use of existing resources. Moreover, we develop a global constraint-based inference process that leverages an existing knowledge base to enforce relational constraints among terms and thus improves the classifier predictions. Our experimental evaluation shows that our approach significantly outperforms other systems built upon the existing well-known knowledge sources.

Original languageEnglish (US)
Pages (from-to)235-262
Number of pages28
JournalNatural Language Engineering
Volume18
Issue number2
DOIs
StatePublished - Apr 1 2012

    Fingerprint

ASJC Scopus subject areas

  • Software
  • Language and Linguistics
  • Linguistics and Language
  • Artificial Intelligence

Cite this