Abstract

Background: Breast cancer is the most prevalent malignancy in women. Advanced breast cancer can develop distant metastases, posing a severe threat to patients' lives. Because the clinical warning signs of distant metastasis appear only in the late stage of the disease, better methods of predicting metastasis are needed.

Methods: First, we screened breast cancer distant metastasis target genes by performing differential expression analysis and weighted gene co-expression network analysis (WGCNA) on the selected datasets, and performed analyses such as GO enrichment analysis on these target genes. Second, we screened biomarkers of breast cancer distant metastasis from these target genes by LASSO regression analysis and performed correlation analysis and other analyses on these biomarkers. Finally, we constructed several breast cancer distant metastasis prediction models based on the Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT) and eXtreme Gradient Boosting (XGBoost) algorithms, and selected the optimal model among them.

Results: Several 21-gene breast cancer distant metastasis prediction models were constructed, of which the model based on random forest performed best. This model accurately predicted the emergence of distant metastases from breast cancer, with an accuracy of 93.6%, an F1-score of 88.9% and an AUC of 91.3% on the validation set.

Conclusion: Our findings have the potential to be translated into a point-of-care prognostic analysis to reduce breast cancer mortality.
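For readers less familiar with the three validation metrics quoted above (accuracy, F1-score and AUC), the following is a minimal sketch of how each is commonly computed for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers it produces are unrelated to the study's results.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = distant metastasis, 0 = no distant metastasis.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]  # predicted probabilities
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print("accuracy:", accuracy(toy_labels, toy_preds))
print("F1-score:", f1_score(toy_labels, toy_preds))
print("AUC:", roc_auc(toy_labels, toy_scores))
```

Accuracy and F1-score depend on the chosen decision threshold, whereas AUC is computed from the ranking of the raw scores, which is one reason the three are often reported together when comparing candidate models.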

