Distantly Supervised Biomedical Named Entity Recognition with Dictionary Expansion

Xuan Wang, Yu Zhang, Qi Li, Xiang Ren, Jingbo Shang, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

State-of-the-art biomedical named entity recognition (BioNER) systems apply supervised machine learning models, which rely on human-annotated training data and therefore do not generalize easily to new entity types and datasets. We propose a distantly supervised approach, AutoBioNER, that automatically recognizes biomedical entities in massive corpora using user-provided dictionaries. AutoBioNER does not need any human-annotated data. It relies on incomplete entity dictionaries to provide seeds for each entity type and performs a novel entity set expansion step for corpus-level recognition of new entities and dictionary completion. The expanded dictionaries are then used as distant supervision to train a neural model for BioNER. Experimental results on BioNER benchmark datasets show that AutoBioNER achieves the best performance among methods that use only dictionaries and no additional human effort. The results also show that the dictionary expansion step plays an important role in this performance.
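
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of using a dictionary as distant supervision: tokens are tagged by greedy longest-match lookup, and the resulting BIO tags could serve as noisy training labels for a sequence model. It is not the AutoBioNER implementation; the function name `distant_labels`, the `max_len` parameter, the type names, and the toy dictionary entries are all assumptions made for illustration.

```python
from typing import Dict, List


def distant_labels(tokens: List[str],
                   dictionary: Dict[str, str],
                   max_len: int = 5) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup in a dictionary.

    `dictionary` maps a lowercased, space-joined phrase to an entity type.
    This is a hypothetical helper, not the method from the paper.
    """
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(lowered[i:i + n])
            etype = dictionary.get(phrase)
            if etype is not None:
                tags[i] = f"B-{etype}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{etype}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Toy seed dictionary (illustrative entries only).
seed = {"breast cancer": "Disease", "tamoxifen": "Chemical"}
tokens = "Tamoxifen is used to treat breast cancer .".split()
tags = distant_labels(tokens, seed)
for token, tag in zip(tokens, tags):
    print(token, tag)
```

With an incomplete dictionary, any entity missing from it is tagged `O`, which is the kind of gap a dictionary expansion step is meant to reduce before the labels are used for training.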

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Editors: Illhoi Yoo, Jinbo Bi, Xiaohua Tony Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 496-503
Number of pages: 8
ISBN (Electronic): 9781728118673
State: Published - Nov 2019
Event: 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019 - San Diego, United States
Duration: Nov 18 2019 to Nov 21 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019

Conference

Conference: 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Country/Territory: United States
City: San Diego
Period: 11/18/19 to 11/21/19

Keywords

  • biomedical named entity recognition
  • distantly supervised learning
  • entity expansion

ASJC Scopus subject areas

  • Biochemistry
  • Biotechnology
  • Molecular Medicine
  • Modeling and Simulation
  • Health Informatics
  • Pharmacology (medical)
  • Public Health, Environmental and Occupational Health
