TY - GEN
T1 - Entity disambiguation with linkless knowledge bases
AU - Li, Yang
AU - Tan, Shulong
AU - Sun, Huan
AU - Han, Jiawei
AU - Roth, Dan
AU - Yan, Xifeng
PY - 2016
Y1 - 2016
N2 - Named Entity Disambiguation is the task of disambiguating named entity mentions in natural language text and link them to their corresponding entries in a reference knowledge base (e.g. Wikipedia). Such disambiguation can help add semantics to plain text and distinguish homonymous entities. Previous research has tackled this problem by making use of two types of context-Aware features derived from the reference knowledge base, namely, the context similarity and the semantic relatedness. Both features heavily rely on the cross-document hyperlinks within the knowledge base: The semantic relatedness feature is directly measured via those hyperlinks, while the context similarity feature implicitly makes use of those hyperlinks to expand entity candidates' descriptions and then compares them against the query context. Unfortunately, cross-document hyperlinks are rarely available in many closed domain knowledge bases and it is very expensive to manually add such links. Therefore few algorithms can work well on linkless knowledge bases. In this work, we propose the challenging Named Entity Disambiguation with Linkless Knowledge Bases (LNED) problem and tackle it by leveraging the useful disambiguation evidences scattered across the reference knowledge base. We propose a generative model to automatically mine such evidences out of noisy information. The mined evidences can mimic the role of the missing links and help boost the LNED performance. Experimental results show that our proposed method substantially improves the disambiguation accuracy over the baseline approaches.
AB - Named Entity Disambiguation is the task of disambiguating named entity mentions in natural language text and link them to their corresponding entries in a reference knowledge base (e.g. Wikipedia). Such disambiguation can help add semantics to plain text and distinguish homonymous entities. Previous research has tackled this problem by making use of two types of context-Aware features derived from the reference knowledge base, namely, the context similarity and the semantic relatedness. Both features heavily rely on the cross-document hyperlinks within the knowledge base: The semantic relatedness feature is directly measured via those hyperlinks, while the context similarity feature implicitly makes use of those hyperlinks to expand entity candidates' descriptions and then compares them against the query context. Unfortunately, cross-document hyperlinks are rarely available in many closed domain knowledge bases and it is very expensive to manually add such links. Therefore few algorithms can work well on linkless knowledge bases. In this work, we propose the challenging Named Entity Disambiguation with Linkless Knowledge Bases (LNED) problem and tackle it by leveraging the useful disambiguation evidences scattered across the reference knowledge base. We propose a generative model to automatically mine such evidences out of noisy information. The mined evidences can mimic the role of the missing links and help boost the LNED performance. Experimental results show that our proposed method substantially improves the disambiguation accuracy over the baseline approaches.
KW - Entity Disambiguation
KW - Evidence Mining
KW - Generative Model
KW - Linkless Knowledge Bases
UR - http://www.scopus.com/inward/record.url?scp=85003879736&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85003879736&partnerID=8YFLogxK
U2 - 10.1145/2872427.2883068
DO - 10.1145/2872427.2883068
M3 - Conference contribution
AN - SCOPUS:85003879736
T3 - 25th International World Wide Web Conference, WWW 2016
SP - 1261
EP - 1270
BT - 25th International World Wide Web Conference, WWW 2016
PB - International World Wide Web Conferences Steering Committee
T2 - 25th International World Wide Web Conference, WWW 2016
Y2 - 11 April 2016 through 15 April 2016
ER -