Chemical entity normalization for successful translational development of Alzheimer’s disease and dementia therapeutics

Sarah Mullin, Robert McDougal, Kei Hoi Cheung, Halil Kilicoglu, Amanda Beck, Caroline J. Zeiss

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Identifying chemical mentions within the Alzheimer’s and dementia literature can provide a powerful tool to further therapeutic research. Leveraging the Chemical Entities of Biological Interest (ChEBI) ontology, which is rich in hierarchical and other relationship types, for entity normalization can provide an advantage for future downstream applications. We provide a reproducible hybrid approach that combines an ontology-enhanced PubMedBERT model for disambiguation with a dictionary-based method for candidate selection.

Results: There were 56,553 chemical mentions in the titles of 44,812 unique PubMed article abstracts. Based on our gold standard, our method of disambiguation improved entity normalization by 25.3 percentage points compared to using only the dictionary-based approach with fuzzy-string matching for disambiguation. On the CRAFT corpus, our method reached 91.17% accuracy, outperforming the baselines (maximum 78.4%). For our Alzheimer’s and dementia cohort, we were able to add 47.1% more potential mappings between MeSH and ChEBI when compared to BioPortal.

Conclusion: The use of natural language models like PubMedBERT and resources such as ChEBI and PubChem provides a beneficial way to link entity mentions to ontology terms, while further supporting downstream tasks like filtering ChEBI mentions based on roles and assertions to find beneficial therapies for Alzheimer’s and dementia.
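To make the dictionary-based baseline mentioned in the abstract more concrete, the following is a minimal sketch of candidate selection with fuzzy-string matching. It is not the authors' implementation: the toy dictionary, the placeholder `TOY:` identifiers (deliberately not real ChEBI IDs), the function names, the 0.8 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration. The PubMedBERT disambiguation step is not shown.

```python
from difflib import SequenceMatcher

# Hypothetical toy dictionary: chemical name -> placeholder identifier.
# The "TOY:" IDs are illustrative only and are NOT real ChEBI identifiers.
TOY_DICTIONARY = {
    "donepezil": "TOY:0001",
    "memantine": "TOY:0002",
    "galantamine": "TOY:0003",
    "acetylcholine": "TOY:0004",
}


def normalize(text):
    """Lowercase and collapse whitespace before comparing strings."""
    return " ".join(text.lower().split())


def select_candidates(mention, dictionary=TOY_DICTIONARY, threshold=0.8, top_k=3):
    """Return up to top_k (id, name, score) tuples whose similarity to the
    mention is at least `threshold`, best match first."""
    query = normalize(mention)
    scored = []
    for name, ident in dictionary.items():
        score = SequenceMatcher(None, query, normalize(name)).ratio()
        if score >= threshold:
            scored.append((score, name, ident))
    # Sort by descending score, then by name so ties are deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(ident, name, round(score, 3)) for score, name, ident in scored[:top_k]]


def fuzzy_baseline(mention, dictionary=TOY_DICTIONARY, threshold=0.8):
    """Baseline disambiguation: pick the highest-scoring candidate, or None."""
    ranked = select_candidates(mention, dictionary, threshold, top_k=1)
    return ranked[0][0] if ranked else None


if __name__ == "__main__":
    for mention in ["Donepezil", "memantin", "unknown compound"]:
        print(mention, "->", fuzzy_baseline(mention))
```

In a hybrid setup of the kind the abstract describes, the list returned by a function like `select_candidates` would be handed to a learned model for disambiguation, rather than simply taking the top fuzzy match as `fuzzy_baseline` does here.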

Original language: English (US)
Article number: 13
Journal: Journal of Biomedical Semantics
Volume: 15
Issue number: 1
State: Published - Dec 2024

Keywords

  • Alzheimer
  • ChEBI
  • Dementia
  • Entity normalization
  • Ontology

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Health Informatics
  • Computer Networks and Communications
