TY - GEN
T1 - AutoKnow
T2 - 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020
AU - Dong, Xin Luna
AU - He, Xiang
AU - Kan, Andrey
AU - Li, Xian
AU - Liang, Yan
AU - Ma, Jun
AU - Xu, Yifan Ethan
AU - Zhang, Chenwei
AU - Zhao, Tong
AU - Blanco Saldana, Gabriel
AU - Deshpande, Saurabh
AU - Michetti Manduca, Alexandre
AU - Ren, Jay
AU - Singh, Surender Pal
AU - Xiao, Fan
AU - Chang, Haw Shiuan
AU - Karamanolakis, Giannis
AU - Mao, Yuning
AU - Wang, Yaqing
AU - Faloutsos, Christos
AU - McCallum, Andrew
AU - Han, Jiawei
N1 - Publisher Copyright: © 2020 Owner/Author.
PY - 2020/8/23
Y1 - 2020/8/23
AB - Can one build a knowledge graph (KG) for all products in the world? Knowledge graphs have firmly established themselves as valuable sources of information for search and question answering, and it is natural to wonder if a KG can contain information about products offered at online retail sites. There have been several successful examples of generic KGs, but organizing information about products poses many additional challenges, including sparsity and noise of structured data for products, complexity of the domain with millions of product types and thousands of attributes, heterogeneity across a large number of categories, as well as a large and constantly growing number of products. We describe AutoKnow, our automatic (self-driving) system that addresses these challenges. The system includes a suite of novel techniques for taxonomy construction, product property identification, knowledge extraction, anomaly detection, and synonym discovery. AutoKnow is (a) automatic, requiring little human intervention, (b) multi-scalable, scalable in multiple dimensions (many domains, many products, and many attributes), and (c) integrative, exploiting rich customer behavior logs. AutoKnow has been operational in collecting product knowledge for over 11K product types.
KW - attribute importance
KW - data cleaning
KW - data imputation
KW - knowledge graphs
KW - synonym finding
KW - taxonomy enrichment
UR - http://www.scopus.com/inward/record.url?scp=85090409887&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090409887&partnerID=8YFLogxK
U2 - 10.1145/3394486.3403323
DO - 10.1145/3394486.3403323
M3 - Conference contribution
AN - SCOPUS:85090409887
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 2724
EP - 2734
BT - KDD 2020 - Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 23 August 2020 through 27 August 2020
ER -