Context. To take full advantage of upcoming large-scale spectroscopic surveys, it will be necessary to parameterize millions of stellar spectra in an efficient way. Machine learning methods, especially convolutional neural networks (CNNs), will be among the main tools geared at achieving this task. Aims. We aim to prepare the groundwork for machine learning techniques for the next generation of spectroscopic surveys, such as 4MOST and WEAVE. Our goal is to show that CNNs can predict accurate stellar labels from relevant spectral features in a physically meaningful way. The predicted labels can be used to investigate properties of the Milky Way galaxy. Methods. We built a neural network and trained it on GIRAFFE spectra with their associated stellar labels from the sixth internal Gaia-ESO data release. Our network architecture contains several convolutional layers that allow the network to identify absorption features in the input spectra. The internal uncertainty was estimated from multiple network models. We used the t-distributed stochastic neighbor embedding tool to remove bad spectra from our training sample. Results. Our neural network is able to predict the atmospheric parameters Teff and log(g) as well as the chemical abundances [Mg/Fe], [Al/Fe], and [Fe/H] for 36 904 stellar spectra. The training precision is 37 K for Teff, 0.06 dex for log(g), 0.05 dex for [Mg/Fe], 0.08 dex for [Al/Fe], and 0.04 dex for [Fe/H]. Network gradients reveal that the network is inferring the labels in a physically meaningful way from spectral features. We validated our methodology using benchmark stars and recovered the properties of different stellar populations in the Milky Way galaxy. Conclusions. Such a study provides very good insights into the application of machine learning for the analysis of large-scale spectroscopic surveys, such as WEAVE and 4MOST Milky Way disk and bulge low- and high-resolution (4MIDABLE-LR and -HR). The community will have to put substantial efforts into building proactive training sets for machine learning methods to minimize any possible systematics.
Read full abstract