TY - GEN
T1 - Language and domain independent entity linking with quantified collective validation
AU - Wang, Han
AU - Zheng, Jin Guang
AU - Ma, Xiaogang
AU - Fox, Peter
AU - Ji, Heng
N1 - Publisher Copyright: © 2015 Association for Computational Linguistics.
PY - 2015
Y1 - 2015
N2 - Linking named mentions detected in a source document to an existing knowledge base provides disambiguated entity referents for the mentions. This allows better document analysis, knowledge extraction and knowledge base population. Most of the previous research extensively exploited the linguistic features of the source documents in a supervised or semi-supervised way. These systems therefore cannot be easily applied to a new language or domain. In this paper, we present a novel unsupervised algorithm named Quantified Collective Validation that avoids excessive linguistic analysis on the source documents and fully leverages the knowledge base structure for the entity linking task. We show our approach achieves state-of-the-art English entity linking performance and demonstrate successful deployment in a new language (Chinese) and two new domains (Biomedical and Earth Science). Experiment datasets and system demonstration are available at http://tw.rpi.edu/web/doc/hanwang-emnlp-2015 for research purpose.
AB - Linking named mentions detected in a source document to an existing knowledge base provides disambiguated entity referents for the mentions. This allows better document analysis, knowledge extraction and knowledge base population. Most of the previous research extensively exploited the linguistic features of the source documents in a supervised or semi-supervised way. These systems therefore cannot be easily applied to a new language or domain. In this paper, we present a novel unsupervised algorithm named Quantified Collective Validation that avoids excessive linguistic analysis on the source documents and fully leverages the knowledge base structure for the entity linking task. We show our approach achieves state-of-the-art English entity linking performance and demonstrate successful deployment in a new language (Chinese) and two new domains (Biomedical and Earth Science). Experiment datasets and system demonstration are available at http://tw.rpi.edu/web/doc/hanwang-emnlp-2015 for research purpose.
UR - http://www.scopus.com/inward/record.url?scp=84959932002&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959932002&partnerID=8YFLogxK
U2 - 10.18653/v1/d15-1081
DO - 10.18653/v1/d15-1081
M3 - Conference contribution
AN - SCOPUS:84959932002
T3 - Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing
SP - 695
EP - 704
BT - Conference Proceedings - EMNLP 2015
PB - Association for Computational Linguistics (ACL)
T2 - Conference on Empirical Methods in Natural Language Processing, EMNLP 2015
Y2 - 17 September 2015 through 21 September 2015
ER -