Breast cancer remains a major global health problem, accounting for approximately 2.3 million new cases and 670,000 deaths worldwide in 2022. Early detection and accurate diagnosis are crucial to improving patient outcomes, because delayed identification can lead to severe complications. Advances in machine learning (ML) have improved cancer diagnosis, with a range of algorithms increasing predictive accuracy. This study proposes a novel ensemble model for breast cancer classification that uses 31 features from the University of Wisconsin Breast Cancer dataset. We trained six algorithms: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression, Random Forest, Gradient Boosting, and XGBoost. Their predictions were combined through Hard Voting, in which each classifier casts one vote per sample and the majority class becomes the ensemble's prediction. The model was evaluated on standard classification metrics: accuracy, precision, recall, and F1 score. Results indicate that the proposed ensemble outperforms the individual classifiers and other ensembles on these metrics, showing potential as a reliable tool for early breast cancer detection.
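As a rough illustration of the Hard Voting step named in the abstract, the sketch below takes the label each classifier assigns to each sample and returns the majority label. It is not the authors' implementation: the function name `hard_vote`, the example predictions, and the 0 = benign / 1 = malignant encoding are all hypothetical. The tie-breaking rule (smallest label wins) is also an assumption; it matters here because six voters can split 3 to 3.

```python
from collections import Counter
from typing import List, Sequence


def hard_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Return the majority label per sample.

    predictions[i][j] is the label classifier i assigns to sample j.
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every classifier must predict the same samples")

    voted = []
    # zip(*predictions) regroups the votes so each tuple holds one sample's votes.
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        # Assumed tie-break: among the most-voted labels, pick the smallest.
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


# Hypothetical predictions from six classifiers on four samples
# (0 = benign, 1 = malignant; values invented for illustration only).
example_predictions = [
    [1, 0, 1, 0],  # KNN
    [1, 0, 0, 0],  # SVM
    [1, 1, 0, 0],  # Logistic Regression
    [1, 0, 1, 1],  # Random Forest
    [0, 0, 1, 1],  # Gradient Boosting
    [1, 0, 0, 1],  # XGBoost
]
ensemble_labels = hard_vote(example_predictions)
print(ensemble_labels)
```

In practice a library ensemble such as scikit-learn's `VotingClassifier` with `voting="hard"` would likely replace this hand-written function; the sketch only makes the voting rule explicit.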