AstroLLaMA: Towards Specialized Foundation Models in Astronomy

  • Tuan Dung Nguyen
  • Yuan-Sen Ting
  • Ioana Ciuca
  • Charles O'Neill
  • Ze-Chang Sun
  • Maja Jabłońska
  • Sandor Kruk
  • Ernest Perkowski
  • Jack Miller
  • Jason Jason Jingsh Li
  • Josh Peek
  • Kartheik Iyer
  • Tomasz Rozanski
  • Pranav Khetarpal
  • Sharaf Zaman
  • David Brodrick
  • Sergio J. Rodriguez Mendez
  • Thang Bui
  • Alyssa Goodman
  • Alberto Accomazzi
  • Jill Naiman
  • Jesse Cranney
  • Kevin Schawinski
  • Roberta Raileanu

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Large language models often excel in many human-language tasks but tend to falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA shows marked domain adaptation by achieving a 30% lower perplexity than LLaMA-2. Compared to state-of-the-art foundation models, AstroLLaMA generates more insightful and scientifically relevant text completions and embedding extraction despite having significantly fewer parameters. AstroLLaMA serves as a highly domain-specific model with broad fine-tuning potential: Its public release aims to spur astronomy-focused research, including automatic paper summarization, conversational agent development and hypothesis generation.
Original language: English (US)
Title of host publication: Proceedings of the Second Workshop on Information Extraction from Scientific Publications
Editors: Tirthankar Ghosal, Felix Grezes, Thomas Allen, Kelly Lockhart, Alberto Accomazzi, Sergi Blanco-Cuaresma
Place of Publication: Bali, Indonesia
Publisher: Association for Computational Linguistics
Pages: 49-55
Number of pages: 7
State: Published - Nov 1 2023
