Abstract

Cervical cancer remains one of the most prevalent cancers in women worldwide. It is therefore necessary to identify the important risk factors of cervical cancer in order to classify potential patients. The present work proposes a cervical cancer prediction model (CCPM) that offers early prediction of cervical cancer using risk factors as inputs. The CCPM first removes outliers using outlier detection methods such as density-based spatial clustering of applications with noise (DBSCAN) and isolation forest (iForest). It then balances the dataset by increasing the number of cases, for example, through the synthetic minority over-sampling technique (SMOTE) or SMOTE with Tomek link (SMOTETomek). Finally, it employs random forest (RF) as a classifier. Thus, CCPM comprises four scenarios: (1) DBSCAN + SMOTETomek + RF, (2) DBSCAN + SMOTE + RF, (3) iForest + SMOTETomek + RF, and (4) iForest + SMOTE + RF. A dataset of 858 potential patients was used to validate the performance of the proposed method. We found that the combinations of iForest with SMOTE and iForest with SMOTETomek performed better than DBSCAN with SMOTE and DBSCAN with SMOTETomek. We also observed that RF performed best among several popular machine learning classifiers. Furthermore, the proposed CCPM showed better accuracy than previously proposed methods for predicting cervical cancer. In addition, a mobile application that collects cervical cancer risk factor data and returns results from CCPM was developed to support prompt and appropriate action at the initial stage of cervical cancer.
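The balancing step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority case is placed on the line segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function names `smote_oversample` and `balance`, the default `k=5`, the binary-label assumption, and the toy data are all assumptions made for illustration, and the sketch omits Tomek-link cleaning, outlier removal, and the RF classifier.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts.

    Assumes binary labels: every label other than minority_label is majority.
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_points = smote_oversample(minority, n_new, k=k, seed=seed) if n_new else []
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (hypothetical): three minority cases (label 1), five majority cases (label 0).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
     [5.0, 5.0], [5.0, 6.0], [6.0, 5.0], [6.0, 6.0], [5.5, 5.5]]
y = [1, 1, 1, 0, 0, 0, 0, 0]
X_balanced, y_balanced = balance(X, y)
```

Because every synthetic point is an interpolation between two existing minority cases, it stays inside the region already occupied by the minority class. This is one reason outliers would plausibly be removed before oversampling, since an outlying minority case would otherwise seed further synthetic cases around it.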

Highlights

  • Cervical cancer complications are often associated with the infection of human papillomavirus

  • One form of gynecological cancer is cervical cancer

  • The ten features extracted by Chi-square were used for all models (SVM, multilayer perceptron (MLP), logistic regression (LR), naïve Bayes, and K-nearest neighbors (KNN)), and all four target variables (Biopsy, Schiller, Hinselmann, and Cytology)


Summary

Introduction

Cervical cancer complications are often associated with human papillomavirus (HPV) infection. Cervical cancer is a common debilitating disease among women worldwide. It is the third most commonly diagnosed cancer (~485,000 cases) and the fourth leading cause of cancer-related deaths worldwide (236,000) each year [1,2]. The main cause of cervical cancer is persistent infection by oncogenic HPV. The risk of cervical cancer increases two- to threefold if an HPV-infected patient smokes [6]. With regard to pregnancy, HPV-infected women with no pregnancies have a lower incidence of cervical cancer than those with more than one full-term pregnancy [7].

