To develop and validate machine learning models that distinguish human epidermal growth factor receptor 2 (HER2)-zero from HER2-low breast cancer using MRI features acquired before neoadjuvant therapy (NAT). A total of 516 patients with breast cancer who underwent surgery after NAT were randomly divided into a training set (n = 362) and an internal validation set (n = 154) for model building and evaluation. The following MRI features were reviewed: tumour diameter, enhancement type, background parenchymal enhancement, enhancement pattern, percentage of enhancement, signal enhancement ratio, breast oedema, and apparent diffusion coefficient. Logistic regression (LR), support vector machine (SVM), k-nearest neighbour (KNN), and extreme gradient boosting (XGBoost) models used these MRI features to assess HER2 status in the training and validation datasets. The best-performing model generated a HER2 score, which was then evaluated for its association with pathological complete response (pCR) and disease-free survival (DFS). The XGBoost model outperformed LR, SVM, and KNN, achieving an area under the receiver operating characteristic curve (AUC) of 0.783 (95% CI, 0.733-0.833) in the training dataset and 0.787 (95% CI, 0.709-0.865) in the validation dataset. Its HER2 score predicted pCR with an AUC of 0.708 in the training dataset and 0.695 in the validation dataset. Additionally, a low HER2 score was significantly associated with shorter DFS in the validation dataset (hazard ratio: 2.748; 95% CI, 1.016-7.432; P = .037). The XGBoost model could help distinguish HER2-zero from HER2-low breast cancers and has the potential to predict pCR and prognosis in patients with breast cancer undergoing NAT. HER2-low-expressing breast cancer can benefit from HER2-targeted therapy, so prediction of HER2-low expression is crucial for appropriate management. MRI features offer a way to address this clinical need.
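The evaluation steps described above (a random 362/154 split, then an AUC with a 95% CI for a model score) can be sketched in plain Python. This is only an illustrative sketch: the abstract does not state how the confidence intervals were computed, so the percentile bootstrap used here is an assumption, and the function names `split_patients`, `auc`, and `bootstrap_ci` as well as the synthetic scores are hypothetical and not taken from the study.

```python
import random


def split_patients(n_total=516, n_train=362, seed=0):
    """Randomly split patient indices into training and validation sets."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:]


def auc(scores, labels):
    """AUC as the probability that a random positive case outscores
    a random negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # resample contained a single class; skip it
            continue
        stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[max(int((1 - alpha / 2) * n_boot) - 1, 0)]
    return lower, upper


if __name__ == "__main__":
    # Synthetic stand-in data: 1 = HER2-low, 0 = HER2-zero (not study data).
    rng = random.Random(42)
    labels = [rng.randint(0, 1) for _ in range(516)]
    scores = [0.8 * y + rng.gauss(0, 1) for y in labels]

    train_idx, val_idx = split_patients()
    for name, idx in (("training", train_idx), ("validation", val_idx)):
        s = [scores[i] for i in idx]
        y = [labels[i] for i in idx]
        lo, hi = bootstrap_ci(s, y)
        print(f"{name}: AUC {auc(s, y):.3f} (95% CI, {lo:.3f}-{hi:.3f})")
```

In practice the scores would come from a fitted classifier such as XGBoost rather than from simulated values; the pairwise AUC definition is used here only because it needs no external libraries.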