TY - GEN
T1 - Blind men and the elephant
T2 - 16th IEEE International Conference on Data Mining, ICDM 2016
AU - Wang, Xiaolong
AU - Wang, Jingjing
AU - Jie, Luo
AU - Zhai, Chengxiang
AU - Chang, Yi
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/7/2
Y1 - 2016/7/2
N2 - Crowdsourcing services make it possible to collect a huge number of annotations from less-trained crowd workers in an inexpensive and efficient manner. However, unlike making binary or pairwise judgements, labeling complex structures such as ranked lists by crowd workers is subject to large variance and low efficiency, mainly due to the huge labeling space and the annotators' non-expert nature. Yet ranked lists offer the most informative knowledge for training and testing in various data mining and information retrieval tasks such as learning to rank. In this paper, we propose a novel generative model called 'Thurstonian Pairwise Preference' (TPP) to infer the true ranked list out of a collection of crowdsourced pairwise annotations. The key challenges that TPP addresses are to resolve the inevitable incompleteness and inconsistency of judgements, as well as to model variable query difficulty and different labeling quality resulting from workers' domain expertise and truthfulness. Experimental results on both synthetic and real-world datasets demonstrate that TPP can effectively bind pairwise preferences of the crowd into rankings and substantially outperforms previously published methods.
AB - Crowdsourcing services make it possible to collect a huge number of annotations from less-trained crowd workers in an inexpensive and efficient manner. However, unlike making binary or pairwise judgements, labeling complex structures such as ranked lists by crowd workers is subject to large variance and low efficiency, mainly due to the huge labeling space and the annotators' non-expert nature. Yet ranked lists offer the most informative knowledge for training and testing in various data mining and information retrieval tasks such as learning to rank. In this paper, we propose a novel generative model called 'Thurstonian Pairwise Preference' (TPP) to infer the true ranked list out of a collection of crowdsourced pairwise annotations. The key challenges that TPP addresses are to resolve the inevitable incompleteness and inconsistency of judgements, as well as to model variable query difficulty and different labeling quality resulting from workers' domain expertise and truthfulness. Experimental results on both synthetic and real-world datasets demonstrate that TPP can effectively bind pairwise preferences of the crowd into rankings and substantially outperforms previously published methods.
KW - Crowdsourcing
KW - Ranking
KW - Thurstonian Pairwise Preference
UR - http://www.scopus.com/inward/record.url?scp=85014547234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85014547234&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2016.159
DO - 10.1109/ICDM.2016.159
M3 - Conference contribution
AN - SCOPUS:85014547234
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 509
EP - 518
BT - Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016
A2 - Bonchi, Francesco
A2 - Domingo-Ferrer, Josep
A2 - Baeza-Yates, Ricardo
A2 - Zhou, Zhi-Hua
A2 - Wu, Xindong
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 12 December 2016 through 15 December 2016
ER -