Abstract

Cervical cancer (CC) is one of the major causes of premature mortality worldwide, and more than 85% of CC deaths occur in developing countries. Early diagnosis and treatment are essential to reducing mortality, as they lead to better outcomes and longer patient survival. CC is associated with several risk factors, but a data set of such factors may contain redundant, irrelevant, and unreliable features that make classification results unreliable. Feature selection techniques are a possible solution to this problem. In this study, a Novel Genetic-inspired Binary Firefly Algorithm with Random Forest (NGBFA-RF) is proposed for dimensionality reduction and for finding a good set of features for classification. The study is based on the CC Risk Factors data set, which contains 32 risk factors and four dependent variables. Class imbalance was alleviated with the SMOTE data sampling technique. The main goal of the proposed method is to improve predictive accuracy with a small number of features, thereby reducing classification errors. The proposed algorithm, which combines the Firefly Algorithm with genetic operations, showed better results than the existing models it was compared against. Its efficacy was assessed in terms of accuracy, recall, precision, F1-score, and AUC-ROC values. The results show that a reduced feature set helps: NGBFA-RF, together with the hybrid ensemble classifier, reaches an accuracy of 98% with only five features.
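As a rough illustration of the idea of a binary firefly search with genetic operations, the sketch below represents each firefly as a 0/1 mask over 32 features and lets a dimmer firefly move toward a brighter one by uniform crossover followed by bit-flip mutation. It is not the authors' implementation: the function names, the parameter values, the greedy acceptance rule, and especially `toy_fitness` (a stand-in for the accuracy of a Random Forest trained on the selected features) are all assumptions made for this example.

```python
import random

# Hypothetical stand-in for classifier accuracy on the selected features.
# It rewards masks that include an assumed set of "informative" indices
# and applies a small penalty per selected feature.
def toy_fitness(mask, informative=frozenset({0, 3, 7, 12, 20}), penalty=0.01):
    hits = sum(1 for i in informative if mask[i])
    return hits / len(informative) - penalty * sum(mask)

def move_toward(dim, bright, rng, p_cross=0.5, p_mut=0.02):
    """Genetic-style move: uniform crossover with the brighter mask, then bit-flip mutation."""
    child = [b if rng.random() < p_cross else d for d, b in zip(dim, bright)]
    return [bit ^ 1 if rng.random() < p_mut else bit for bit in child]

def select_features(n_features=32, n_fireflies=10, n_iter=50,
                    fitness=toy_fitness, seed=0):
    rng = random.Random(seed)
    # Random initial population of binary feature masks.
    swarm = [[rng.randint(0, 1) for _ in range(n_features)]
             for _ in range(n_fireflies)]
    best = max(swarm, key=fitness)
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                # Firefly i is attracted only to brighter (fitter) fireflies.
                if fitness(swarm[j]) > fitness(swarm[i]):
                    candidate = move_toward(swarm[i], swarm[j], rng)
                    # Assumed greedy rule: keep the move only if it does not hurt.
                    if fitness(candidate) >= fitness(swarm[i]):
                        swarm[i] = candidate
        best = max(swarm + [best], key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    mask, score = select_features()
    print("selected feature indices:", [i for i, bit in enumerate(mask) if bit])
    print("fitness:", round(score, 3))
```

In a wrapper setup like the one the abstract describes, `fitness` would instead train and validate a classifier on the columns selected by the mask, after resampling the training data to address class imbalance.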
