Taxonomy-Guided Fine-Grained Entity Set Expansion

Jinfeng Xiao, Mohab Elkaref, Nathan Herr, Geeth De Mel, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Entity set expansion, the task of expanding a small set of similar entities into a much larger set, is a vital step for downstream tasks such as named entity recognition, knowledge base construction, and information retrieval. Existing entity set expansion methods were developed mainly with coarse-grained entities in mind, and they encounter difficulties at fine-grained levels because of the subtlety of fine-grained type inference and semantic drifting. In this study, we propose an automated (i.e., without human annotation) fine-grained set expansion framework, FGExpan, which utilizes a taxonomy structure and a pre-trained language model to achieve high performance. To facilitate our testing, we also construct a new fine-grained set expansion dataset. Experiments on this dataset and those used in previous studies show that FGExpan achieves significantly better performance (MAP up by 0.176) on fine-grained types and also state-of-the-art expansion quality on coarse-grained entity sets.

Original language: English (US)
Title of host publication: 2023 SIAM International Conference on Data Mining, SDM 2023
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 631-639
Number of pages: 9
ISBN (Electronic): 9781611977653
State: Published - 2023
Event: 2023 SIAM International Conference on Data Mining, SDM 2023 - Minneapolis, United States
Duration: Apr 27 2023 - Apr 29 2023

Publication series

Name: 2023 SIAM International Conference on Data Mining, SDM 2023

Conference

Conference: 2023 SIAM International Conference on Data Mining, SDM 2023
Country/Territory: United States
City: Minneapolis
Period: 4/27/23 - 4/29/23

ASJC Scopus subject areas

  • Education
  • Information Systems
