Abstract
Large language models excel at many human-language tasks but tend to falter in highly specialized domains such as scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 on over 300,000 astronomy abstracts from arXiv. Trained with the standard causal language modeling objective, AstroLLaMA shows marked domain adaptation, achieving 30% lower perplexity than LLaMA-2. Despite having significantly fewer parameters than state-of-the-art foundation models, AstroLLaMA produces more insightful and scientifically relevant text completions and embeddings. AstroLLaMA serves as a highly domain-specific model with broad fine-tuning potential: its public release aims to spur astronomy-focused research, including automatic paper summarization, conversational agent development, and hypothesis generation.
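Since the release is a standard causal LM checkpoint, both uses the abstract highlights, text completion and embedding extraction, follow the usual Hugging Face `transformers` workflow. The sketch below illustrates this under assumptions: the Hub model ID `universeTBD/astrollama` and the mean-pooling recipe for embeddings are not confirmed by this record, and the paper's exact embedding scheme may differ.

```python
# Minimal sketch of using AstroLLaMA for completion and embeddings.
# Assumptions: the checkpoint is published on the Hugging Face Hub as
# "universeTBD/astrollama" (hypothetical here), and embeddings are taken
# by mean-pooling final hidden states (one common recipe, not necessarily
# the paper's).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "universeTBD/astrollama"  # assumed Hub ID for the released model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# Text completion: continue an abstract-style prompt with causal decoding.
prompt = "The dark matter halo of the Milky Way is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs, max_new_tokens=64, do_sample=True, top_p=0.9
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

# Embedding extraction: mean-pool the final hidden states over
# non-padding tokens to get one vector per input text.
with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states[-1]
mask = inputs["attention_mask"].unsqueeze(-1)
embedding = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # (1, hidden_size)
```

Because the model is a fine-tuned LLaMA-2, any pipeline that already consumes LLaMA-family checkpoints (quantized inference, further fine-tuning, retrieval over pooled embeddings) should work unchanged.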
| Original language | English (US) |
|---|---|
| Title of host publication | Proceedings of the Second Workshop on Information Extraction from Scientific Publications |
| Editors | Tirthankar Ghosal, Felix Grezes, Thomas Allen, Kelly Lockhart, Alberto Accomazzi, Sergi Blanco-Cuaresma |
| Place of Publication | Bali, Indonesia |
| Publisher | Association for Computational Linguistics |
| Pages | 49-55 |
| Number of pages | 7 |
| DOIs | |
| State | Published - Nov 1 2023 |