ABSTRACT Although red clump (RC) stars are easy to identify owing to their stable luminosity and colour, about 20–50 per cent of stars identified this way are actually red giant branch (RGB) stars that occupy the same location on the HR diagram. In this paper, we identify a sample of 210 504 spectra for 184 318 primary RC (PRC) stars from LAMOST DR7, with a purity higher than 90 per cent. The RC and RGB stars are distinguished from LAMOST spectra (R ∼ 1800 and signal-to-noise ratio >10) with the XGBoost ensemble learning algorithm, and the secondary RC stars are also removed. SHapley Additive exPlanations (SHAP) values are used to explain the top features selected by the XGBoost model. These features lie around Fe5270, MgH & Mg Ib, Fe4957, Fe4207, Cr5208, and CN, and they successfully distinguish RGB from RC stars. XGBoost is also used to estimate the ages and masses of the PRC stars by training on their spectra, with asteroseismic parameters from Kepler as labels. The uncertainties of mass and age are 13 and 31 per cent, respectively. Checking the feature attribution of this model, we find that the age-sensitive elements identified by XGBoost are consistent with the literature. Distances of the PRC stars are derived from the K_S absolute magnitude calibrated with Gaia EDR3; they have an uncertainty of about 6 per cent and show that the stars are mainly located in the Galactic disc. We also test XGBoost at R ∼ 250, the resolution of the Chinese Space Station Telescope, which is under construction; the model is still able to find sensitive features that distinguish RC from RGB stars.
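The distance step can be illustrated with the standard distance modulus. The following is a minimal sketch only, not the calibration used in the paper: the function names, the apparent magnitude, and the absolute magnitude value `M_KS_ASSUMED` are hypothetical placeholders, and extinction is set to zero by default. It also shows that, to first order, a magnitude scatter of about 0.13 mag corresponds to a fractional distance error of about 6 per cent.

```python
import math


def distance_pc(m_app, M_abs, extinction=0.0):
    """Distance in parsec from the distance modulus m - M - A = 5 log10(d) - 5."""
    return 10.0 ** ((m_app - M_abs - extinction + 5.0) / 5.0)


def fractional_distance_error(sigma_mag):
    """First-order fractional distance error for a magnitude error sigma_mag."""
    return math.log(10.0) / 5.0 * sigma_mag


# Illustrative absolute magnitude only; NOT a value taken from the paper.
M_KS_ASSUMED = -1.6

# Hypothetical star with apparent K_S magnitude 10.0 and no extinction.
d = distance_pc(m_app=10.0, M_abs=M_KS_ASSUMED)

# Fractional distance error implied by an assumed 0.13 mag scatter.
frac = fractional_distance_error(0.13)

print(f"distance = {d:.0f} pc, fractional error = {frac:.3f}")
```

Because the relation is exponential in magnitude, the fractional distance error depends only on the magnitude error, not on the distance itself, which is why a single percentage can describe the whole sample.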