PENNER: Pattern-enhanced Nested Named Entity Recognition in Biomedical Literature

Xuan Wang, Yu Zhang, Qi Li, Cathy H. Wu, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many biomedical entity mentions contain other entity mentions nested inside them. Most current named entity recognition (NER) systems handle only flat entities and ignore such nested entities, which may introduce errors into subsequent tasks such as relation extraction and knowledge base completion. Fully supervised methods have recently been proposed for nested named entity recognition. Despite their success on benchmark datasets, supervised methods rely on human annotation and lead to highly specialized systems that cannot be easily adapted to new entity types. In this study, we propose PENNER, a novel and effective pattern-enhanced nested named entity recognition method that relies on massive corpora plus only very weak supervision. We compare PENNER with a state-of-the-art BioNER system, PubTator, and observe great improvement in recognizing genes, chemicals, diseases and species. PENNER can also accurately extract new types of entities, such as biological process and treatment, that are not annotated by PubTator.

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
Editors: Harald Schmidt, David Griol, Haiying Wang, Jan Baumbach, Huiru Zheng, Zoraida Callejas, Xiaohua Hu, Julie Dickerson, Le Zhang
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 540-547
Number of pages: 8
ISBN (Electronic): 9781538654880
State: Published - Jan 21 2019
Event: 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018 - Madrid, Spain
Duration: Dec 3 2018 → Dec 6 2018

Publication series

Name: Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018

Conference

Conference: 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
Country/Territory: Spain
City: Madrid
Period: 12/3/18 → 12/6/18

Keywords

  • meta-pattern discovery
  • multi-set expansion
  • nested named entity recognition
  • pattern mining

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
