TY - GEN
T1 - TaxoEnrich
T2 - 31st ACM World Wide Web Conference, WWW 2022
AU - Jiang, Minhao
AU - Song, Xiangchen
AU - Zhang, Jieyu
AU - Han, Jiawei
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/4/25
Y1 - 2022/4/25
N2 - Taxonomies are fundamental to many real-world applications in various domains, serving as structural representations of knowledge. To deal with the increasing volume of new concepts that need to be organized into taxonomies, researchers have turned to the automatic completion of an existing taxonomy with new concepts. In this paper, we propose TaxoEnrich, a new taxonomy completion framework, which effectively leverages both semantic features and structural information in the existing taxonomy and offers a better representation of candidate positions to boost the performance of taxonomy completion. Specifically, TaxoEnrich consists of four components: (1) a taxonomy-contextualized embedding which incorporates both the semantic meanings of concepts and taxonomic relations based on powerful pretrained language models; (2) a taxonomy-aware sequential encoder which learns candidate position representations by encoding the structural information of the taxonomy; (3) a query-aware sibling encoder which adaptively aggregates candidate siblings to augment candidate position representations based on their importance to the query-position matching; (4) a query-position matching model which extends existing work with our new candidate position representations. Extensive experiments on four large real-world datasets from different domains show that TaxoEnrich achieves the best performance across all evaluation metrics and outperforms previous state-of-the-art methods by a large margin.
AB - Taxonomies are fundamental to many real-world applications in various domains, serving as structural representations of knowledge. To deal with the increasing volume of new concepts that need to be organized into taxonomies, researchers have turned to the automatic completion of an existing taxonomy with new concepts. In this paper, we propose TaxoEnrich, a new taxonomy completion framework, which effectively leverages both semantic features and structural information in the existing taxonomy and offers a better representation of candidate positions to boost the performance of taxonomy completion. Specifically, TaxoEnrich consists of four components: (1) a taxonomy-contextualized embedding which incorporates both the semantic meanings of concepts and taxonomic relations based on powerful pretrained language models; (2) a taxonomy-aware sequential encoder which learns candidate position representations by encoding the structural information of the taxonomy; (3) a query-aware sibling encoder which adaptively aggregates candidate siblings to augment candidate position representations based on their importance to the query-position matching; (4) a query-position matching model which extends existing work with our new candidate position representations. Extensive experiments on four large real-world datasets from different domains show that TaxoEnrich achieves the best performance across all evaluation metrics and outperforms previous state-of-the-art methods by a large margin.
KW - Knowledge Representation
KW - Self-supervised Learning
KW - Taxonomy Completion
UR - http://www.scopus.com/inward/record.url?scp=85129813592&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129813592&partnerID=8YFLogxK
U2 - 10.1145/3485447.3511935
DO - 10.1145/3485447.3511935
M3 - Conference contribution
AN - SCOPUS:85129813592
T3 - WWW 2022 - Proceedings of the ACM Web Conference 2022
SP - 925
EP - 934
BT - WWW 2022 - Proceedings of the ACM Web Conference 2022
PB - Association for Computing Machinery, Inc
Y2 - 25 April 2022 through 29 April 2022
ER -