We present a new exploratory framework to model galaxy formation and evolution in a hierarchical Universe by using machine learning (ML). Our motivations are two-fold: (1) presenting a new, promising technique to study galaxy formation, and (2) quantitatively analysing the extent of the influence of dark matter halo properties on galaxies in the backdrop of semi-analytical models (SAMs). We use the influential Millennium Simulation and the corresponding Munich SAM to train and test various sophisticated ML algorithms (k-Nearest Neighbors, decision trees, random forests, and extremely randomized trees). By using only essential dark matter halo physical properties for haloes of M > 10^12 M⊙ and a partial merger tree, our model predicts the hot gas mass, cold gas mass, bulge mass, total stellar mass, black hole mass and cooling radius at z = 0 for each central galaxy in a dark matter halo for the Millennium run. Our results provide a unique and powerful phenomenological framework to explore the galaxy-halo connection that is built upon SAMs and demonstrably place ML as a promising and computationally efficient tool to study small-scale structure formation.
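As a rough illustration of the kind of supervised regression described above, the following minimal sketch trains a k-nearest-neighbours regressor to map halo properties to a galaxy property. It is not the authors' pipeline: the data are synthetic, the two features and the toy relation producing the target are invented for illustration, and the helper names (`knn_predict`, `make_synthetic_halo`) are hypothetical.

```python
import math
import random


def knn_predict(train_X, train_y, query, k=5):
    """Predict a target as the mean over the k nearest training points."""
    # Rank training points by Euclidean distance to the query point.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def make_synthetic_halo(rng):
    """Hypothetical toy halo; these values are NOT Millennium Simulation data."""
    log_mass = rng.uniform(12.0, 14.0)   # log10 of halo mass, assumed range
    spin = rng.uniform(0.0, 0.1)         # assumed dimensionless spin-like feature
    # Invented toy relation standing in for a SAM output (log10 stellar mass).
    log_mstar = 0.8 * log_mass + 2.0 * spin + rng.gauss(0.0, 0.05)
    return (log_mass, spin), log_mstar


rng = random.Random(0)
samples = [make_synthetic_halo(rng) for _ in range(600)]
train, test = samples[:500], samples[500:]
train_X = [x for x, _ in train]
train_y = [y for _, y in train]

# Predict the held-out targets and compare against a constant-mean baseline.
preds = [knn_predict(train_X, train_y, x, k=5) for x, _ in test]
truth = [y for _, y in test]
mse = sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(truth)
baseline = sum(train_y) / len(train_y)
baseline_mse = sum((baseline - t) ** 2 for t in truth) / len(truth)
print(f"kNN MSE: {mse:.4f}, mean-baseline MSE: {baseline_mse:.4f}")
```

Comparing against the constant-mean baseline gives a simple check that the regressor extracts information from the halo features. The tree-based methods named in the abstract would replace `knn_predict` with a different learner while keeping the same train/test split.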

Original language: English (US)
Pages (from-to): 642-658
Number of pages: 17
Journal: Monthly Notices of the Royal Astronomical Society
Issue number: 1
State: Published - Jan 1 2016


  • Cosmology
  • Evolution
  • Formation
  • Galaxies
  • Haloes
  • Large-scale structure of Universe
  • Theory

ASJC Scopus subject areas

  • Astronomy and Astrophysics
  • Space and Planetary Science


Title: Machine learning and cosmological simulations - I. Semi-analytical models
