TY - GEN
T1 - Place deduplication with embeddings
AU - Yang, Carl
AU - Mikolov, Tomas
AU - Hoang, Do Huy
AU - Han, Jiawei
N1 - Publisher Copyright:
© 2019 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC-BY 4.0 License.
PY - 2019/5/13
Y1 - 2019/5/13
N2 - Thanks to advances in mobile location services, people can now post about places to share their visiting experiences on the go. A large place graph not only helps users explore interesting destinations, but also provides opportunities for understanding and modeling the real world. To improve the coverage and flexibility of the place graph, many platforms import place data from multiple sources, which unfortunately produces numerous duplicated places that severely hinder subsequent location-related services. In this work, we take the anonymous place graph from Facebook as an example to systematically study the problem of place deduplication: we carefully formulate the problem, study its connections to various related tasks that lead to several promising basic models, and arrive at a systematic two-step data-driven pipeline based on place embedding with multiple novel techniques that works significantly better than the state of the art.
KW - Feature generation
KW - Metric learning
KW - Place deduplication
UR - http://www.scopus.com/inward/record.url?scp=85066836234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066836234&partnerID=8YFLogxK
U2 - 10.1145/3308558.3313456
DO - 10.1145/3308558.3313456
M3 - Conference contribution
AN - SCOPUS:85066836234
T3 - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
SP - 3420
EP - 3426
BT - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
PB - Association for Computing Machinery, Inc
T2 - 2019 World Wide Web Conference, WWW 2019
Y2 - 13 May 2019 through 17 May 2019
ER -