Training a geographic entity recognizer on biomedical abstracts with the aid of embeddings, metadata, and linked data

Xiaoliang Jiang, Nigel Bosch, Vetle I. Torvik

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Public access to scientific literature has fueled research in text mining and natural language processing, yet the problem of geographic named entity recognition persists. This paper describes a recognizer that uses candidates from multiple existing Named Entity Recognition (NER) tools to ensure high recall and uses a filtering model trained on sentence embeddings, metadata, and citation data to improve precision. Experimental results on a manually curated set of biomedical abstracts show that this filtering model preserves high recall while achieving much higher precision than all of the individual NER tools. This should enable more effective geography-based analysis of scientific literature, for example, to study the role of place in biomedical discovery.
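The two-stage idea in the abstract (pool candidates from several NER tools for recall, then filter them for precision) can be illustrated with a minimal sketch. Everything named here is hypothetical: the gazetteer taggers stand in for the existing NER tools, and `toy_score` is a hand-written rule standing in for the paper's trained filtering model, which uses sentence embeddings, metadata, and citation data. The abstract does not specify the tools, features, or threshold.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical candidate representation: (start offset, end offset, surface text).
Candidate = Tuple[int, int, str]
Tagger = Callable[[str], Iterable[Candidate]]


def make_gazetteer_tagger(terms: List[str]) -> Tagger:
    """Toy stand-in for an NER tool: report every occurrence of the listed terms."""
    def tagger(text: str) -> Iterable[Candidate]:
        for term in terms:
            start = text.find(term)
            while start != -1:
                yield (start, start + len(term), term)
                start = text.find(term, start + 1)
    return tagger


def union_candidates(text: str, taggers: List[Tagger]) -> Set[Candidate]:
    """Stage 1 (recall): keep a span if any tagger proposes it."""
    pooled: Set[Candidate] = set()
    for tagger in taggers:
        pooled.update(tagger(text))
    return pooled


def filter_candidates(
    text: str,
    candidates: Iterable[Candidate],
    score: Callable[[str, Candidate], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Candidate]:
    """Stage 2 (precision): keep candidates whose score reaches the threshold."""
    return sorted(c for c in candidates if score(text, c) >= threshold)


def toy_score(text: str, cand: Candidate) -> float:
    """Placeholder for a trained classifier: reject spans preceded by 'Dr. '."""
    start, _, _ = cand
    return 0.0 if text[:start].endswith("Dr. ") else 1.0


TEXT = "Samples were collected in Kenya and analyzed in Boston by Dr. Paris."
taggers = [
    make_gazetteer_tagger(["Kenya", "Paris"]),
    make_gazetteer_tagger(["Boston", "Kenya"]),
]
pooled = union_candidates(TEXT, taggers)
kept = filter_candidates(TEXT, pooled, toy_score)
print([c[2] for c in kept])
```

In this toy run the pooled set contains "Kenya", "Boston", and "Paris", and the filter drops "Paris" because it follows a personal title. A real filter would replace `toy_score` with a model's predicted probability.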

Original language: English (US)
Title of host publication: JCDL 2024 - Proceedings of the 24th ACM/IEEE Joint Conference on Digital Libraries
Editors: Jian Wu, Xiao Hu, Terhi Nurmikko-Fuller, Sam Chu, Ruixian Yang, J. Stephen Downie
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798400710933
State: Published - Mar 13 2025
Event: 24th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2024 - Hong Kong, Hong Kong
Duration: Dec 16 2024 to Dec 20 2024

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
ISSN (Print): 1552-5996

Conference

Conference: 24th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2024
Country/Territory: Hong Kong
City: Hong Kong
Period: 12/16/24 to 12/20/24

Keywords

  • Biomedical Text Mining
  • Geographic Entity Recognition
  • Geoparsing
  • Information Extraction
  • Linked Data
  • Metadata
  • Named Entity Recognition
  • Natural Language Processing
  • Scholarly Document Processing
  • Sentence Embeddings

ASJC Scopus subject areas

  • General Engineering
