Abstract

Cervical cancer (CC) is a leading cause of death among middle-aged women worldwide, particularly in developing countries, which account for approximately 85% of deaths. CC can be cured if it is detected in the early stages. Because no symptoms appear in the initial stages, predicting the disease early remains a challenge for investigators. Several machine learning algorithms have been used to predict CC over the last decade. Compared with a single classifier, ensemble methods can give more accurate results by creating and combining multiple models. In this study, we built a hybrid ensemble classifier, 'A Robust Model Stacking: A Hybrid Ensemble,' in which a homogeneous ensemble is performed on a pool of classifiers at the base level, followed by a heterogeneous ensemble that uses the majority voting (soft) algorithm to obtain the final prediction for new data. The dataset used in this study contains 858 instances with 32 features derived from risk factors and four target variables derived from CC diagnostic tests (Hinselmann, Schiller, Citology, and Biopsy). We addressed the class imbalance problem using the SMOTE oversampling technique. The model's performance was evaluated using accuracy, precision, recall, F1-score, and the AUC-ROC curve for all four target variables in the dataset. The proposed method achieves an accuracy of 98% for Biopsy, 97% for Hinselmann, 96.09% for Schiller, and 93% for Citology. We use ensemble learning in this study to increase prediction accuracy and decrease bias and variance. The experiments were carried out in Python using Google Colab and Jupyter notebooks. The experimental results show that the proposed hybrid ensemble achieves high accuracy for all four target variables.
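The two-level combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each homogeneous ensemble combines its members by averaging their class probabilities, and that the heterogeneous level then applies soft voting (averaging again and taking the most probable class). The algorithm names and probability values below are hypothetical, since the abstract does not list the base classifiers.

```python
from statistics import fmean


def average_probabilities(prob_vectors):
    """Average a list of per-class probability vectors element-wise."""
    return [fmean(column) for column in zip(*prob_vectors)]


def hybrid_soft_vote(pools):
    """Combine predictions for ONE sample in two levels.

    `pools` maps an algorithm name to the probability vectors produced by
    the members of that algorithm's homogeneous ensemble.
    """
    # Level 1 (homogeneous): one averaged vector per algorithm.
    level1 = {name: average_probabilities(members)
              for name, members in pools.items()}
    # Level 2 (heterogeneous): soft vote across the different algorithms.
    final = average_probabilities(list(level1.values()))
    predicted = max(range(len(final)), key=final.__getitem__)
    return predicted, final


# Hypothetical outputs [P(negative), P(positive)] for one patient record.
pools = {
    "decision_tree": [[0.25, 0.75], [0.5, 0.5], [0.0, 1.0]],
    "knn": [[0.5, 0.5], [0.25, 0.75]],
    "logistic_regression": [[0.75, 0.25], [0.25, 0.75]],
}
label, probs = hybrid_soft_vote(pools)
print(label, probs)
```

In practice the probability vectors would come from trained models, and oversampling such as SMOTE would be applied to the training split only, so that synthetic samples do not leak into the evaluation data.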

