Language and domain independent entity linking with quantified collective validation

Han Wang, Jin Guang Zheng, Xiaogang Ma, Peter Fox, Heng Ji

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Linking named mentions detected in a source document to an existing knowledge base provides disambiguated entity referents for the mentions. This allows better document analysis, knowledge extraction and knowledge base population. Most of the previous research extensively exploited the linguistic features of the source documents in a supervised or semi-supervised way. These systems therefore cannot be easily applied to a new language or domain. In this paper, we present a novel unsupervised algorithm named Quantified Collective Validation that avoids excessive linguistic analysis on the source documents and fully leverages the knowledge base structure for the entity linking task. We show our approach achieves stateof-the-art English entity linking performance and demonstrate successful deployment in a new language (Chinese) and two new domains (Biomedical and Earth Science). Experiment datasets and system demonstration are available at http://tw.rpi.edu/web/doc/hanwang-emnlp-2015 for research purpose.

Original languageEnglish (US)
Title of host publicationConference Proceedings - EMNLP 2015
Subtitle of host publicationConference on Empirical Methods in Natural Language Processing
PublisherAssociation for Computational Linguistics (ACL)
Pages695-704
Number of pages10
ISBN (Electronic)9781941643327
DOIs
StatePublished - 2015
Externally publishedYes
EventConference on Empirical Methods in Natural Language Processing, EMNLP 2015 - Lisbon, Portugal
Duration: Sep 17 2015Sep 21 2015

Publication series

NameConference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing

Other

OtherConference on Empirical Methods in Natural Language Processing, EMNLP 2015
Country/TerritoryPortugal
CityLisbon
Period9/17/159/21/15

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems

Fingerprint

Dive into the research topics of 'Language and domain independent entity linking with quantified collective validation'. Together they form a unique fingerprint.

Cite this