TY - JOUR
T1 - Deep learning at scale for the construction of galaxy catalogs in the Dark Energy Survey
AU - Khan, Asad
AU - Huerta, E. A.
AU - Wang, Sibo
AU - Gruendl, Robert
AU - Jennings, Elise
AU - Zheng, Huihuo
N1 - Funding Information:
This research is part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993 ) and the State of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications. We acknowledge support from the NCSA . We thank the NCSA Gravity Group for useful feedback, and Vlad Kindratenko for granting us access to state-of-the-art GPUs and HPC resources at the Innovative Systems Lab at NCSA. We are grateful to NVIDIA for donating several Tesla P100 and V100 GPUs that we used for our analysis. We thank the anonymous referee for carefully reading this manuscript, and providing constructive feedback to improve the presentation of our results.
Funding Information:
This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation (NSF) grant number ACI-1548562 . Specifically, it used the Bridges system, which is supported by National Science Foundation (NSF) award number ACI-1445606 , at the Pittsburgh Supercomputing Center (PSC). We gratefully acknowledge grant TG-PHY160053. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357 .
Funding Information:
This project used public archival data from the Dark Energy Survey (DES). Funding for the DES Projects has been provided by the U.S. Department of Energy, the U.S. National Science Foundation, the Ministry of Science and Education of Spain, the Science and Technology Facilities Council of the United Kingdom, the Higher Education Funding Council for England, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the Kavli Institute of Cosmological Physics at the University of Chicago, the Center for Cosmology and Astro-Particle Physics at the Ohio State University, the Mitchell Institute for Fundamental Physics and Astronomy at Texas A&M University, Financiadora de Estudos e Projetos, Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, Conselho Nacional de Desenvolvimento Científico e Tecnológico and the Ministério da Ciência, Tecnologia e Inovação, the Deutsche Forschungsgemeinschaft and the Collaborating Institutions in the Dark Energy Survey.
Funding Information:
The Collaborating Institutions are Argonne National Laboratory, the University of California at Santa Cruz, the University of Cambridge, Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas–Madrid, the University of Chicago, University College London, the DES-Brazil Consortium, the University of Edinburgh, the Eidgenössische Technische Hochschule (ETH) Zürich, Fermi National Accelerator Laboratory, the University of Illinois at Urbana-Champaign, the Institut de Ciències de l'Espai (IEEC/CSIC), the Institut de Física d'Altes Energies, Lawrence Berkeley National Laboratory, the Ludwig-Maximilians Universität München and the associated Excellence Cluster Universe, the University of Michigan, the National Optical Astronomy Observatory, the University of Nottingham, The Ohio State University, the OzDES Membership Consortium, the University of Pennsylvania, the University of Portsmouth, SLAC National Accelerator Laboratory, Stanford University, the University of Sussex, and Texas A&M University. Based in part on observations at Cerro Tololo Inter-American Observatory, National Optical Astronomy Observatory, which is operated by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation.
PY - 2019/8/10
Y1 - 2019/8/10
N2 - The scale of ongoing and future electromagnetic surveys pose formidable challenges to classify astronomical objects. Pioneering efforts on this front include citizen science campaigns adopted by the Sloan Digital Sky Survey (SDSS). SDSS datasets have been recently used to train neural network models to classify galaxies in the Dark Energy Survey (DES) that overlap the footprint of both surveys. Herein, we demonstrate that knowledge from deep learning algorithms, pre-trained with real-object images, can be transferred to classify galaxies that overlap both SDSS and DES surveys, achieving state-of-the-art accuracy ≳99.6%. We demonstrate that this process can be completed within just eight minutes using distributed training. While this represents a significant step towards the classification of DES galaxies that overlap previous surveys, we need to initiate the characterization of unlabelled DES galaxies in new regions of parameter space. To accelerate this program, we use our neural network classifier to label over ten thousand unlabelled DES galaxies, which do not overlap previous surveys. Furthermore, we use our neural network model as a feature extractor for unsupervised clustering and find that unlabelled DES images can be grouped together in two distinct galaxy classes based on their morphology, which provides a heuristic check that the learning is successfully transferred to the classification of unlabelled DES images. We conclude by showing that these newly labelled datasets can be combined with unsupervised recursive training to create large-scale DES galaxy catalogs in preparation for the Large Synoptic Survey Telescope era.
AB - The scale of ongoing and future electromagnetic surveys pose formidable challenges to classify astronomical objects. Pioneering efforts on this front include citizen science campaigns adopted by the Sloan Digital Sky Survey (SDSS). SDSS datasets have been recently used to train neural network models to classify galaxies in the Dark Energy Survey (DES) that overlap the footprint of both surveys. Herein, we demonstrate that knowledge from deep learning algorithms, pre-trained with real-object images, can be transferred to classify galaxies that overlap both SDSS and DES surveys, achieving state-of-the-art accuracy ≳99.6%. We demonstrate that this process can be completed within just eight minutes using distributed training. While this represents a significant step towards the classification of DES galaxies that overlap previous surveys, we need to initiate the characterization of unlabelled DES galaxies in new regions of parameter space. To accelerate this program, we use our neural network classifier to label over ten thousand unlabelled DES galaxies, which do not overlap previous surveys. Furthermore, we use our neural network model as a feature extractor for unsupervised clustering and find that unlabelled DES images can be grouped together in two distinct galaxy classes based on their morphology, which provides a heuristic check that the learning is successfully transferred to the classification of unlabelled DES images. We conclude by showing that these newly labelled datasets can be combined with unsupervised recursive training to create large-scale DES galaxy catalogs in preparation for the Large Synoptic Survey Telescope era.
KW - Convolutional neural networks
KW - Dark Energy Survey
KW - Deep learning
KW - Large Synoptic Survey Telescope
KW - Sloan Digital Sky Survey
KW - Unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85067829901&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85067829901&partnerID=8YFLogxK
U2 - 10.1016/j.physletb.2019.06.009
DO - 10.1016/j.physletb.2019.06.009
M3 - Article
AN - SCOPUS:85067829901
VL - 795
SP - 248
EP - 258
JO - Physics Letters, Section B: Nuclear, Elementary Particle and High-Energy Physics
JF - Physics Letters, Section B: Nuclear, Elementary Particle and High-Energy Physics
SN - 0370-2693
ER -