Abstract

We employ the XGBoost machine learning (ML) method for the morphological classification of galaxies into two (early-type, late-type) and five (E, S0–S0a, Sa–Sb, Sbc–Scd, Sd–Irr) classes. The input features combine non-parametric (C, A, S, AS, Gini, M20, c5090), parametric (Sérsic index, n), geometric (axial ratio, BA), global colour (g − i, u − r, u − i), colour gradient (Δ(g − i)), and asymmetry gradient (ΔA9050) information, all estimated for a local galaxy sample (z < 0.15) compiled from Sloan Digital Sky Survey (SDSS) imaging data. We train the XGBoost model and evaluate its performance with multiple standard metrics. The model performs best when all fourteen parameters are used, reaching accuracies of 88% and 65% for the two-class and five-class classification tasks, respectively. In addition, we investigate a hierarchical approach to the five-class problem that combines three XGBoost classifiers. Its performance is comparable to that of the “direct” five-class classification, differing by at most 3%. Using SHAP (SHapley Additive exPlanations), a model-interpretation tool, we analyse how the galaxy parameters influence the model’s classifications. Finally, we compare our results with previous studies and find them consistent.
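The hierarchical scheme mentioned in the abstract can be illustrated with a minimal sketch. The abstract states only that three classifiers are combined; the particular layout below (one early/late classifier feeding one sub-classifier per branch) is an assumption, as are the function names, the feature keys `n`, `BA` and `g_i`, and the toy threshold rules, which merely stand in for trained models. In practice each callable would wrap the `predict` method of a fitted model such as `xgboost.XGBClassifier`.

```python
from typing import Callable, Mapping

# A feature vector is a mapping from parameter name to value;
# a "classifier" is any callable that turns one into a class label.
Features = Mapping[str, float]
Classifier = Callable[[Features], str]


def hierarchical_predict(
    x: Features,
    top: Classifier,
    early_branch: Classifier,
    late_branch: Classifier,
) -> str:
    """Route one galaxy through an assumed two-level hierarchy.

    `top` decides early vs late; the matching branch classifier
    then assigns one of the five fine classes.
    """
    broad = top(x)
    if broad == "early":
        return early_branch(x)
    if broad == "late":
        return late_branch(x)
    raise ValueError(f"unexpected top-level label: {broad!r}")


# Toy stand-ins for trained models (hypothetical thresholds,
# not values from the paper).
def toy_top(x: Features) -> str:
    return "early" if x["n"] >= 2.5 else "late"


def toy_early(x: Features) -> str:
    return "E" if x["BA"] >= 0.7 else "S0-S0a"


def toy_late(x: Features) -> str:
    if x["g_i"] >= 1.0:
        return "Sa-Sb"
    if x["g_i"] >= 0.7:
        return "Sbc-Scd"
    return "Sd-Irr"


galaxy = {"n": 4.0, "BA": 0.9, "g_i": 1.2}
label = hierarchical_predict(galaxy, toy_top, toy_early, toy_late)
print(label)
```

One consequence of this design is that an error at the top level cannot be corrected further down: a late-type galaxy misrouted to the early branch can only receive an early-type label.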