Abstract
Purpose: This study presents a comprehensive machine learning framework for assessing breast cancer malignancy by integrating clinical features with imaging features derived from deep learning.

Methods: The dataset included 1668 patients with documented breast lesions, incorporating clinical data (e.g., age, BI-RADS category, lesion size, margins, and calcifications) alongside mammographic images processed using four CNN architectures: EfficientNet, ResNet, DenseNet, and InceptionNet. Three predictive configurations were developed: an imaging-only model, a hybrid model combining imaging and clinical data, and a stacking-based ensemble model that aggregates both data types to enhance predictive accuracy. Twelve feature selection techniques, including ReliefF and Fisher Score, were applied to identify key predictive features. Model performance was evaluated using accuracy and AUC, with 5-fold cross-validation and hyperparameter tuning to ensure robustness.

Results: The imaging-only models demonstrated strong predictive performance, with EfficientNet achieving an AUC of 0.76. The hybrid model combining imaging and clinical data reached the highest accuracy of 83% and an AUC of 0.87, underscoring the benefits of data integration. The stacking-based ensemble model further improved performance, reaching a peak AUC of 0.94, demonstrating its potential as a reliable tool for malignancy risk assessment.

Conclusion: This study highlights the importance of integrating clinical and deep imaging features for breast cancer risk stratification, with the stacking-based ensemble model providing the most reliable malignancy risk assessment.
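The sketch below illustrates, under stated assumptions, the kind of stacking-based ensemble described in the Methods: CNN-derived imaging features are concatenated with clinical features, fed to base learners whose out-of-fold predictions are combined by a meta-learner, and evaluated with 5-fold cross-validated AUC. The feature arrays, cohort size, and choice of base and meta estimators are illustrative placeholders; the abstract does not specify the exact learners used.

```python
# Minimal sketch of a stacking ensemble over combined imaging + clinical features,
# assuming CNN embeddings (e.g., from EfficientNet) were already extracted per patient.
# Base learners and the logistic-regression meta-learner are assumptions for illustration.
import numpy as np
from sklearn.ensemble import (
    StackingClassifier,
    RandomForestClassifier,
    GradientBoostingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_patients = 200                                   # placeholder cohort size
X_imaging = rng.normal(size=(n_patients, 128))     # hypothetical CNN embedding per mammogram
X_clinical = rng.normal(size=(n_patients, 8))      # e.g., age, BI-RADS, lesion size, margins
y = rng.integers(0, 2, size=n_patients)            # 1 = malignant, 0 = benign

# Hybrid representation: concatenate imaging-derived and clinical features.
X = np.hstack([X_imaging, X_clinical])

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # internal folds used to build out-of-fold meta-features
)

# 5-fold cross-validated AUC, mirroring the evaluation protocol in the abstract.
auc_scores = cross_val_score(stack, X, y, cv=5, scoring="roc_auc")
print(f"Mean AUC: {auc_scores.mean():.3f}")
```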