TY - JOUR
T1 - Probabilistic cosmic web classification using fast-generated training data
AU - Buncher, Brandon
AU - Carrasco Kind, Matias
N1 - Funding Information:
This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1746047.
Publisher Copyright:
© 2020 The Author(s).
PY - 2020/10/1
Y1 - 2020/10/1
N2 - We present a novel method of robust probabilistic cosmic web particle classification in three dimensions using a supervised machine learning algorithm. Training data were generated using a simplified ΛCDM toy model with pre-determined algorithms for generating haloes, filaments, and voids. While this framework is not constrained by physical modelling, it can be generated substantially more quickly than an N-body simulation without loss in classification accuracy. For each particle in this data set, measurements were taken of the local density field magnitude and directionality. These measurements were used to train a random forest algorithm, which was used to assign class probabilities to each particle in a ΛCDM, dark matter-only N-body simulation with 2563 particles, as well as on another toy model data set. By comparing the trends in the ROC curves and other statistical metrics of the classes assigned to particles in each data set using different feature sets, we demonstrate that the combination of measurements of the local density field magnitude and directionality enables accurate and consistent classification of halo, filament, and void particles in varied environments. We also show that this combination of training features ensures that the construction of our toy model does not affect classification. The use of a fully supervised algorithm allows greater control over the information deemed important for classification, preventing issues arising from arbitrary hyperparameters and mode collapse in deep learning models. Due to the speed of training data generation, our method is highly scalable, making it particularly suited for classifying large data sets, including observed data.
AB - We present a novel method of robust probabilistic cosmic web particle classification in three dimensions using a supervised machine learning algorithm. Training data were generated using a simplified ΛCDM toy model with pre-determined algorithms for generating haloes, filaments, and voids. While this framework is not constrained by physical modelling, it can be generated substantially more quickly than an N-body simulation without loss in classification accuracy. For each particle in this data set, measurements were taken of the local density field magnitude and directionality. These measurements were used to train a random forest algorithm, which was used to assign class probabilities to each particle in a ΛCDM, dark matter-only N-body simulation with 2563 particles, as well as on another toy model data set. By comparing the trends in the ROC curves and other statistical metrics of the classes assigned to particles in each data set using different feature sets, we demonstrate that the combination of measurements of the local density field magnitude and directionality enables accurate and consistent classification of halo, filament, and void particles in varied environments. We also show that this combination of training features ensures that the construction of our toy model does not affect classification. The use of a fully supervised algorithm allows greater control over the information deemed important for classification, preventing issues arising from arbitrary hyperparameters and mode collapse in deep learning models. Due to the speed of training data generation, our method is highly scalable, making it particularly suited for classifying large data sets, including observed data.
KW - dark matter
KW - galaxies: evolution
KW - galaxies: haloes
KW - large-scale structure of Universe
KW - methods: data analysis
KW - methods: statistical
UR - http://www.scopus.com/inward/record.url?scp=85092432836&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85092432836&partnerID=8YFLogxK
U2 - 10.1093/mnras/staa2008
DO - 10.1093/mnras/staa2008
M3 - Article
AN - SCOPUS:85092432836
SN - 0035-8711
VL - 497
SP - 5041
EP - 5060
JO - Monthly Notices of the Royal Astronomical Society
JF - Monthly Notices of the Royal Astronomical Society
IS - 4
ER -