A Cascading Ensemble with Custom Subset Generation and Multi-level Fusion for Enhanced Breast Cancer Detection

Vandana Lingampally ,K.Radhika

doi:10.36948/ijfmr.2023.v05i03.3662

Vandana Lingampally , K.Radhika

Open Access

PDF Available

https://doi.org/10.36948/ijfmr.2023.v05i03.3662

Copy DOI

Export

Save

Cite

Abstract
Full-Text PDF
Similar Papers

Abstract

Listen

The ensemble learning technique is a powerful method that combines multiple machine learning models to address their individual limitations and create an optimized predictive model. By harnessing the strengths of each model, ensemble learning significantly enhances overall performance, making it particularly valuable for classification tasks across diverse domains. In this study, an efficient three-level stacking ensemble approach is proposed for diagnosing breast cancer. At the first level, a diverse set of base learners is employed, encompassing decision trees, logistic regression, k-nearest neighbors, support vector machines, and Gaussian naive Bayes. Each first-level learner is trained on distinct subsets of the training data with 10-fold cross-validation, thereby ensuring model robustness and mitigating the risk of overfitting. Transitioning to the second level, sophisticated ensemble models including Adaboost, GBM, and Random forest are introduced. These second-level learners undergo training using the validation predictions generated by the first-level models, while incorporating the top six informative features extracted from the dataset. To enhance their performance and mitigate overfitting, the models are optimized through the application of 10-fold cross-validation. In order to achieve further refinement of the ensemble, a third-level learner, represented by a deep neural network (DNN), is introduced. This DNN is trained on the validation predictions obtained from both the first and second levels, facilitating the capturing and synthesis of the collective knowledge of the ensemble. By leveraging the inherent capabilities of deep learning, the DNN maximizes the amalgamated benefits derived from the preceding levels, resulting in a substantial enhancement of the overall predictive power. The effectiveness of the ensemble approach is evaluated using well-established performance metrics, including accuracy, sensitivity, and specificity, which are derived from the analysis of the confusion matrix. The results incontrovertibly demonstrate that the ensemble models surpass individual machine learning models in accurately detecting breast cancer.

Full Text