TY - GEN
T1 - Learning to rank from distant supervision
T2 - 29th International Conference on Data Engineering, ICDE 2013
AU - Zhou, Mianwei
AU - Wang, Hongning
AU - Chang, Kevin Chen Chuan
PY - 2013
Y1 - 2013
N2 - In this paper, we study the task of relational entity search, which aims at automatically learning an entity ranking function for a desired relation. To rank entities, we exploit the abundant redundancy in their snippets; however, such redundancy is noisy, as not all the snippets represent information relevant to the desired relation. To extract useful information from such noisy redundancy, we abstract the task as a distantly supervised ranking problem: based on coarse entity-level annotations, derive a relation-specific ranking function for online search. The key challenge is that, without detailed snippet-level annotations, we have to learn an entity ranking function that can effectively filter noise; furthermore, the ranking function must be executable online. We develop the Pattern-based Filter Network (PFNet), a novel probabilistic graphical model, as our solution. To balance the accuracy and efficiency requirements, PFNet selects a limited number of indicative patterns to filter noisy snippets, and inverted indexes are used to retrieve the required features. Experiments on the large-scale ClueWeb09 data set for six different relations confirm the effectiveness of the proposed PFNet model, which outperforms five state-of-the-art relational entity ranking methods.
AB - In this paper, we study the task of relational entity search, which aims at automatically learning an entity ranking function for a desired relation. To rank entities, we exploit the abundant redundancy in their snippets; however, such redundancy is noisy, as not all the snippets represent information relevant to the desired relation. To extract useful information from such noisy redundancy, we abstract the task as a distantly supervised ranking problem: based on coarse entity-level annotations, derive a relation-specific ranking function for online search. The key challenge is that, without detailed snippet-level annotations, we have to learn an entity ranking function that can effectively filter noise; furthermore, the ranking function must be executable online. We develop the Pattern-based Filter Network (PFNet), a novel probabilistic graphical model, as our solution. To balance the accuracy and efficiency requirements, PFNet selects a limited number of indicative patterns to filter noisy snippets, and inverted indexes are used to retrieve the required features. Experiments on the large-scale ClueWeb09 data set for six different relations confirm the effectiveness of the proposed PFNet model, which outperforms five state-of-the-art relational entity ranking methods.
UR - http://www.scopus.com/inward/record.url?scp=84881323542&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84881323542&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2013.6544878
DO - 10.1109/ICDE.2013.6544878
M3 - Conference contribution
AN - SCOPUS:84881323542
SN - 9781467349086
T3 - Proceedings - International Conference on Data Engineering
SP - 829
EP - 840
BT - ICDE 2013 - 29th International Conference on Data Engineering
Y2 - 8 April 2013 through 11 April 2013
ER -