Abstract

Cervical cancer remains a major cause of death worldwide because effective access to cervical screening is still a major challenge. Data mining techniques, including decision tree algorithms, are used in biomedical research for predictive analysis. The imbalanced dataset used here was obtained from the dataset archive of the University of California, Irvine. The Synthetic Minority Oversampling Technique (SMOTE) was used to balance the dataset, which increased the number of instances. The dataset comprises patient age, number of pregnancies, contraceptive use, smoking patterns and chronological records of sexually transmitted diseases (STDs). Microsoft Azure Machine Learning was used to simulate the results. This paper focuses on predicting cervical cancer for different screening methods using the boosted decision tree, decision forest and decision jungle algorithms. Performance was evaluated on the basis of the area under the receiver operating characteristic curve (AUROC), accuracy, specificity and sensitivity. Ten-fold cross-validation was used to validate the results, and the boosted decision tree gave the best results, reaching an AUROC of 0.978 for the Hinselmann screening method. The results obtained by the other classifiers were significantly worse than those of the boosted decision tree.
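The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline, which ran in Microsoft Azure Machine Learning. The function names, the toy labels and scores, and the 0.5 decision threshold are assumptions made for illustration only, and none of the numbers come from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives that were predicted negative."""
    return tn / (tn + fp)


def accuracy(tp, fp, tn, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def auroc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form equals the area under
    the ROC curve and is quadratic in the number of samples, which is
    fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up, not from the paper): 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("accuracy:", accuracy(tp, fp, tn, fn))
print("AUROC:", auroc(y_true, scores))
```

Sensitivity, specificity and accuracy depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason it is often preferred for imbalanced data.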

Highlights

  • Cancer is a dangerous disease in which a group of abnormal cells grows uncontrollably by evading the usual rules of cell division

  • The imbalanced dataset problem, in which cancerous patients were far fewer than non-cancerous patients, was resolved using the Synthetic Minority Oversampling Technique (SMOTE)

  • The predictive ability of the boosted decision tree, measured by the AUROC value, outperformed that of the decision forest and decision jungle
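The core idea of SMOTE mentioned in the highlights can be sketched as follows: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. This is a simplified illustration, not the implementation used in the paper. The function name `smote`, its parameters, and the toy feature vectors are hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    minority: list of numeric feature vectors from the minority class.
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class samples (made up, two features each).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
synthetic = smote(minority, n_synthetic=6, k=2, seed=42)
for point in synthetic:
    print(point)
```

Because every synthetic point lies between two real minority samples, the oversampled class stays inside the region the minority class already occupies. Oversampling of this kind is normally applied to training folds only, so that synthetic points do not leak into the data used for evaluation.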


Summary

Introduction

Cancer is a dangerous disease in which a group of abnormal cells grows uncontrollably by evading the usual rules of cell division. Treatment services are available in 90% of developed countries, compared with less than 26% of low-income countries. Millions of premature deaths among women are due to lung and breast cancer, but cervical cancer is considered the most dangerous because it is diagnosed only in women. An ideal screening test is one that is minimally invasive, easy to perform, acceptable to the subject, cheap and effective in diagnosing the disease at an early stage, when treatment is easier. There are four screening methods: cervical cytology (the Pap smear test), biopsy, Schiller and Hinselmann [10]. In the Schiller test, Lugol's iodine is applied to the cervix, which is then inspected visually to detect suspicious regions [13].

