Towards overcoming data scarcity in materials science: unifying models and datasets with a mixture of experts framework

Research output: Contribution to journal › Article › peer-review

Abstract

While machine learning has emerged in recent years as a useful tool for the rapid prediction of materials properties, generating sufficient data to reliably train models without overfitting is often impractical. Towards overcoming this limitation, we present a general framework for leveraging complementary information across different models and datasets for accurate prediction of data-scarce materials properties. Our approach, based on a machine learning paradigm called mixture of experts, outperforms pairwise transfer learning on 14 of 19 materials property regression tasks, performing comparably on four of the remaining five. The approach is interpretable, model-agnostic, and scalable to combining an arbitrary number of pre-trained models and datasets to any downstream property prediction task. We anticipate the performance of our framework will further improve as better model architectures, new pre-training tasks, and larger materials datasets are developed by the community.
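
The abstract names the mixture of experts paradigm without giving its mechanics. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below combines the outputs of several feature extractors using softmax gate weights. The function names, the toy experts, and the fixed gate scores are all hypothetical; in a real setting the experts would be models pre-trained on different source datasets and the gate scores would be learned on the data-scarce target task.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(x, experts, gate_logits):
    """Return a gate-weighted combination of expert outputs.

    x           : input passed to every expert (here a plain list of floats)
    experts     : callables mapping x to feature vectors of equal length
    gate_logits : one score per expert (assumed fixed here; normally learned)
    """
    weights = softmax(gate_logits)
    outputs = [expert(x) for expert in experts]
    dim = len(outputs[0])
    # Convex combination of the expert feature vectors, one component at a time.
    mixed = [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(dim)]
    return mixed, weights


# Hypothetical stand-ins for extractors pre-trained on different source tasks.
def expert_a(x):
    return [sum(x), 0.0]


def expert_b(x):
    return [0.0, max(x)]


def expert_c(x):
    return [min(x), min(x)]


features, weights = mixture_of_experts(
    [1.0, 2.0, 3.0],
    [expert_a, expert_b, expert_c],
    [0.0, 0.0, 0.0],  # equal scores, so every expert gets the same weight
)
```

Because the gate weights sum to one, they can be read directly as the relative contribution of each expert, which is one way a scheme of this kind can be inspected.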

Original language: English (US)
Article number: 242
Journal: npj Computational Materials
Volume: 8
Issue number: 1
State: Published - Dec 2022

ASJC Scopus subject areas

  • Modeling and Simulation
  • General Materials Science
  • Mechanics of Materials
  • Computer Science Applications
