Abstract
We demonstrate that highly accurate joint redshift–stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning (ML) algorithm, even with few photometric bands available. As an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses. We build two ML models: one containing deep photometry in the griz bands, and the second reflecting the photometric scatter present in the main DES survey, with carefully constructed representative training data in each case. We validate our joint PDFs for 10 699 test galaxies by utilizing the copula probability integral transform and the Kendall distribution function, and their univariate counterparts to validate the marginals. Benchmarked against a basic set-up of the template-fitting code bagpipes, our ML-based method outperforms template fitting on all of our predefined performance metrics. In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under 6 min with consumer computer hardware. Such speed enables PDFs to be derived in real time within analysis codes, solving potential storage issues. As part of this work we have developed galpro, a highly intuitive and efficient Python package to rapidly generate multivariate PDFs on the fly. galpro is documented and available for researchers to use in their cosmology and galaxy evolution studies.
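The univariate probability integral transform (PIT) mentioned in the abstract can be illustrated with a minimal sketch. It is not the galpro API or the authors' code: the function names `pit_values` and `ks_distance_from_uniform` are hypothetical, the Gaussian toy "redshifts" are invented for illustration, and the copula PIT and Kendall distribution function used for the joint PDFs are not covered. The idea is that, for well-calibrated PDFs, the fraction of each galaxy's PDF samples lying at or below its true value should be uniformly distributed on [0, 1].

```python
import random


def pit_values(samples_per_galaxy, true_values):
    """PIT per galaxy: fraction of its PDF samples at or below the true value."""
    pits = []
    for samples, truth in zip(samples_per_galaxy, true_values):
        below = sum(1 for s in samples if s <= truth)
        pits.append(below / len(samples))
    return pits


def ks_distance_from_uniform(pits):
    """Largest gap between the empirical CDF of the PITs and the U(0, 1) CDF."""
    ordered = sorted(pits)
    n = len(ordered)
    gap = 0.0
    for i, p in enumerate(ordered):
        gap = max(gap, abs((i + 1) / n - p), abs(i / n - p))
    return gap


# Toy data (an assumption for illustration, not DES or COSMOS2015 values).
rng = random.Random(0)
n_galaxies, n_samples, sigma = 500, 200, 0.05
centres = [rng.uniform(0.2, 1.0) for _ in range(n_galaxies)]
truths = [rng.gauss(mu, sigma) for mu in centres]

# Calibrated PDFs: samples share the scatter of the true values.
calibrated = [[rng.gauss(mu, sigma) for _ in range(n_samples)] for mu in centres]
# Overconfident PDFs: samples are three times too narrow.
overconfident = [[rng.gauss(mu, sigma / 3) for _ in range(n_samples)] for mu in centres]

pits_cal = pit_values(calibrated, truths)
pits_over = pit_values(overconfident, truths)
ks_cal = ks_distance_from_uniform(pits_cal)
ks_over = ks_distance_from_uniform(pits_over)
print(f"KS distance, calibrated PDFs:    {ks_cal:.3f}")
print(f"KS distance, overconfident PDFs: {ks_over:.3f}")
```

In this toy set-up the overconfident PDFs push the PIT values towards 0 and 1, so their distance from the uniform distribution is larger than for the calibrated PDFs.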
Highlights
The next generation of photometric surveys, such as the Rubin Observatory Legacy Survey of Space and Time (LSST; LSST Science Collaboration 2009) and Euclid (Laureijs et al. 2011), will observe billions of galaxies
We have found that our machine learning (ML)-based method performs better in every aspect compared to a template-fitting method that employs a fairly standard set-up
The emergence of template-fitting methods with the capability of generating multivariate probability distribution functions (PDFs) of redshift and physical properties of galaxies represents a paradigm shift. These PDFs account for potential correlations between different galaxy properties and fully characterize uncertainties associated with point estimates of the quantities
Summary
The next generation of photometric surveys, such as the Rubin Observatory Legacy Survey of Space and Time (LSST; LSST Science Collaboration 2009) and Euclid (Laureijs et al. 2011), will observe billions of galaxies. The sheer amount of data generated will enable studies ranging from the cosmic large-scale structure to the formation and evolution of galaxies to be conducted in unprecedented detail, leading to a transformation in our understanding of the Universe. One of the key challenges will be developing algorithms that can quickly and reliably extract physical properties and redshifts of galaxies. A large number of methods exist to estimate redshifts from photometric data (photo-zs; see Salvato, Ilbert & Hoyle 2019 for a review). They are either physically motivated or data driven.