Background: This study develops a framework that combines machine learning (ML) with explainable artificial intelligence (XAI) to predict breast cancer (BC) metastasis and to identify genomic biomarkers associated with it. Methods: A total of 98 primary BC samples were analyzed, including 34 from patients who developed distant metastasis within a 5-year follow-up period and 44 from patients who remained recurrence-free for at least 5 years after diagnosis. The genomic data were first examined with biostatistical methods, and elastic net feature selection was then applied to narrow the data to a subset of genomic markers associated with BC metastasis. Five boosting-based classifiers were applied: Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), Extreme Gradient Boosting (XGBoost), Gradient Boosting Trees (GBT), and Adaptive Boosting (AdaBoost). Model performance was assessed with accuracy, F1 score, precision, recall, the area under the receiver operating characteristic curve (AUC), and the Brier score. To explain the ML predictions and address the black-box nature of these models, the SHapley Additive exPlanations (SHAP) approach was used. Results: LightGBM performed best, with an accuracy of 96% and an AUC of 99.3%. Consistent with the biostatistical analysis, the SHAP-based XAI analysis indicated that higher expression of TSPYL5, ATP5E, CA9, NUP210, SLC37A1, ARIH1, PSMD7, UBQLN1, PRAME, and UBE2T (p ≤ 0.05) was associated with an increased risk of BC metastasis. Lower expression of CACTIN, TGFB3, SCUBE2, ARL4D, OR1F1, ALDH4A1, PHF1, and CROCC (p ≤ 0.05) was likewise associated with an increased risk of BC metastasis.
Conclusions: The findings of this study may inform strategies to prevent BC progression and metastatic spread, and may support more effective, personalized treatment for BC patients.
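The evaluation metrics named in the Methods can be made concrete with a small sketch. The code below is not the authors' implementation; it is a minimal, dependency-free illustration of how accuracy, precision, recall, F1 score, AUC, and the Brier score might be computed for a binary metastasis classifier. The function name `classification_metrics`, the 0.5 decision threshold, the label coding (1 = metastasis, 0 = no recurrence), and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Dict[str, float]:
    """Compute six binary-classification metrics from labels and probabilities.

    y_true: 1 = metastasis, 0 = no recurrence (hypothetical coding).
    y_prob: predicted probability of the positive class.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # Brier score: mean squared error between probability and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    # AUC as the probability that a random positive outranks a random
    # negative, counting ties as one half (pairwise definition).
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if pos and neg:
        wins = sum(
            1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg
        )
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": auc,
        "brier": brier,
    }


# Toy data, invented for illustration only (not from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
toy_metrics = classification_metrics(toy_labels, toy_probs)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC and the Brier score are computed from the predicted probabilities directly, which is one reason to report both kinds of metric together. The pairwise AUC loop is quadratic in the number of samples; that is acceptable for a cohort of this size, but a rank-based formula would be preferable for large datasets.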