TY - GEN
T1 - Leveraging fine-grained Wikipedia categories for entity search
AU - Ma, Denghao
AU - Chen, Yueguo
AU - Chang, Kevin Chen Chuan
AU - Du, Xiaoyong
AU - Xu, Chuanfei
AU - Chang, Yi
N1 - Yueguo Chen is supported by the National Science Foundation of China under grants No. 61472426, U1711261, 61432006, and the State Visiting Scholar Funds from the China Scholarship Council under Grant Number 201706365018. Denghao Ma is supported by the Outstanding Innovative Talents Cultivation Funded Programs 2017 of Renmin University of China and the State Scholarship Fund from China Scholarship Council under Grant Number 201706360309. Kevin Chang is supported by a gift from Huawei.
PY - 2018/4/10
Y1 - 2018/4/10
AB - Ad-hoc entity search, which retrieves a ranked list of relevant entities in response to a natural-language question, has been widely studied. Category matching of entities, especially matching against fine-grained entity types/categories, has been shown to be critical to entity search performance. However, the potential of fine-grained Wikipedia entity categories has not been well exploited by existing studies. Based on the observation of how people describe entities of a specific type, we propose a headword-and-modifier model to deeply interpret both queries and fine-grained entity types/categories. Probabilistic generative models are designed to effectively estimate the relevance of headwords and modifiers as a pattern-based matching problem, taking the Wikipedia type taxonomy as an important input to address the ad-hoc representations of concepts/entities in queries. Extensive experimental results on three widely used test sets (INEX-XER 2009, SemSearch-LS, and TREC-Entity) show that our method achieves a significant improvement in entity search performance over state-of-the-art methods.
KW - Category matching
KW - Entity search
KW - Language model
UR - https://www.scopus.com/pages/publications/85084074452
U2 - 10.1145/3178876.3186074
DO - 10.1145/3178876.3186074
M3 - Conference contribution
AN - SCOPUS:85084074452
T3 - The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018
SP - 1623
EP - 1632
BT - The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018
PB - Association for Computing Machinery
T2 - 27th International World Wide Web Conference, WWW 2018
Y2 - 23 April 2018 through 27 April 2018
ER -