Machine Learning-Based Prediction of Distant Recurrence in Invasive Breast Carcinoma Using Clinicopathological Data: A Cross-Institutional Study.

Kristen E. Muller, Shivashankar H. Nagaraj, Adrienne A. Workman, Shrey S. Sukhadia

doi:10.3390/cancers15153960

Kristen E. Muller, Shivashankar H. Nagaraj

Open Access

https://doi.org/10.3390/cancers15153960


Abstract

Breast cancer is the most common type of cancer worldwide. Alarmingly, approximately 30% of breast cancer cases result in recurrence at distant organs after treatment. Distant recurrence is more common in some subtypes, such as invasive breast carcinoma (IBC). While clinicians have used several clinicopathological measurements to predict distant recurrences in IBC, no studies have predicted distant recurrences by combining clinicopathological evaluations of IBC tumors pre- and post-therapy with machine learning (ML) models. The goal of our study was to determine whether classification-based ML techniques could predict distant recurrences in IBC patients using key clinicopathological measurements: pathological staging of the tumor and surrounding lymph nodes assessed both pre- and post-neoadjuvant therapy, response to therapy via standard-of-care (SOC) imaging, and binary status of adjuvant therapy administered to patients. We trained and tested four clinicopathological ML models using a dataset from Duke University (144 patients for training and 17 for testing) and validated the best-performing model using an external dataset (8 patients) from Dartmouth Hitchcock Medical Center. The random forest model performed better than the C-support vector classifier, multilayer perceptron, and logistic regression models, yielding area under the receiver operating characteristic curve (AUC) values of 1.0 in the testing set and 0.75 in the validation set (p < 0.002) across both institutions, thereby demonstrating the cross-institutional portability and validity of ML models in clinical cancer research. The top-ranking clinicopathological measurements impacting the prediction of distant recurrences in IBC were identified as tumor response to neoadjuvant therapy as evaluated via SOC imaging and pathology, which included tumor as well as node staging.
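The AUC values reported above summarize how well a classifier ranks patients who recur above those who do not. As a minimal sketch, assuming binary labels (1 = distant recurrence, 0 = none) and a predicted score per patient, the AUC can be computed as the fraction of (recurrent, non-recurrent) patient pairs that the model orders correctly. The function name `roc_auc` and the toy labels and scores are hypothetical and invented for illustration; they are not the study's data or code.

```python
from itertools import product


def roc_auc(labels, scores):
    """Return the ROC AUC as the share of (positive, negative) pairs
    in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0   # pair ranked correctly
        elif p == n:
            wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort, NOT from the study.
# 1 = distant recurrence, 0 = no distant recurrence.
toy_labels = [1, 1, 0, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.3, 0.6, 0.2, 0.8, 0.1, 0.5]

print(f"AUC on toy cohort: {roc_auc(toy_labels, toy_scores):.3f}")
```

An AUC of 1.0 means every recurrent case received a higher score than every non-recurrent case, while 0.5 corresponds to chance-level ranking.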
