BiomedRAG: A retrieval augmented large language model for biomedicine

Mingchen Li, Halil Kilicoglu, Hua Xu, Rui Zhang

Research output: Contribution to journal › Article › peer-review

Abstract

Retrieval-augmented generation (RAG) offers a solution by retrieving knowledge from an established database to enhance the performance of large language models (LLMs). However, these models retrieve information at the sentence or paragraph level, potentially introducing noise and affecting the generation quality. To address these issues, we propose a novel BiomedRAG framework that directly feeds automatically retrieved chunk-based documents into the LLM. Our evaluation of BiomedRAG across four biomedical natural language processing tasks using eight datasets demonstrates that our proposed framework not only improves performance by 9.95% on average, but also achieves state-of-the-art results, surpassing various baselines by 4.97%. BiomedRAG paves the way for more accurate and adaptable LLM applications in the biomedical domain.

Original language: English (US)
Article number: 104769
Journal: Journal of Biomedical Informatics
Volume: 162
State: Published - Feb 2025

Keywords

  • Large language model
  • Retrieval-augmented generation

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
