Abstract. Cervical cancer affects the female reproductive organs and is the second most common cancer among women worldwide. According to the World Health Organization (WHO), approximately 500,000 women are diagnosed with cervical cancer each year, and about 300,000 die from it. Many of these deaths result from insufficient early detection and prevention. Four primary screening techniques are used to detect cervical cancer cells: Hinselmann, Schiller, Cytology, and Biopsy. In this study, patient health history data are analyzed with the K-Nearest Neighbor (KNN) algorithm, which is then optimized with AdaBoost and Particle Swarm Optimization (PSO). The two optimization strategies are compared to identify the most accurate model for detecting patterns in cervical cancer patients and predicting their screening outcomes. The experiments were run in RapidMiner. The findings show that KNN performs multilabel classification effectively and that PSO optimization yields a slight improvement in accuracy.

Purpose: This research assesses the performance of the KNN algorithm in multilabel classification of cervical cancer and optimizes it using AdaBoost and PSO. It is significant because it offers a potentially more accurate method for detecting cervical cancer from medical records.

Methods/Study design/approach: The Cervical Cancer Risk Classification dataset from Kaggle was used. The data were preprocessed before the KNN algorithm was applied. KNN performance was evaluated with 10-fold cross-validation, and results were measured using a confusion matrix. The KNN algorithm was then optimized with AdaBoost and PSO to assess improvements in accuracy.

Result/Findings: KNN achieved its best accuracy at k = 5: 95.81%, 91.26%, 94.64%, and 93.01% for the Hinselmann, Schiller, Cytology, and Biopsy targets, respectively. AdaBoost did not significantly improve accuracy, while PSO slightly raised Hinselmann accuracy from 95.81% to 95.92% (0.11 percentage points). The average training time was around two minutes. These results demonstrate that KNN is effective for multilabel classification in cervical cancer diagnosis.

Novelty/Originality/Value: This research shows that optimizing KNN with PSO can improve accuracy, though not significantly, which suggests room for further development of cervical cancer diagnostic accuracy. Testing the model on more recent data and tuning its parameters may yield better models and useful tools for early cervical cancer diagnosis.
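
The evaluation procedure described in the abstract (KNN with k = 5, 10-fold cross-validation, and a confusion matrix per screening target) can be illustrated with a minimal Python sketch. The study itself used RapidMiner, so this is not the authors' implementation. It assumes one independent binary KNN classifier per target, Euclidean distance, and majority voting, none of which the abstract confirms. The data are synthetic stand-ins rather than the Kaggle dataset, and the function names `knn_predict`, `cross_validated_confusion`, and `accuracy` are hypothetical.

```python
import math
import random
from collections import Counter

TARGETS = ["Hinselmann", "Schiller", "Cytology", "Biopsy"]


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training rows nearest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_validated_confusion(X, y, k=5, folds=10, seed=0):
    """Pool predictions from all held-out folds into one 2x2 confusion matrix."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    matrix = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for f in range(folds):
        test_idx = indices[f::folds]  # each row lands in exactly one test fold
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            predicted = knn_predict(train_X, train_y, X[i], k)
            if predicted == 1:
                matrix["tp" if y[i] == 1 else "fp"] += 1
            else:
                matrix["fn" if y[i] == 1 else "tn"] += 1
    return matrix


def accuracy(matrix):
    """Share of correct predictions: (TP + TN) / all predictions."""
    return (matrix["tp"] + matrix["tn"]) / sum(matrix.values())


# Synthetic stand-in data (assumption): two numeric features per patient and
# four binary targets, each positive when a noisy feature sum is high.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
Y = {
    name: [int(row[0] + row[1] + rng.gauss(0, 0.5) > 1.5) for row in X]
    for name in TARGETS
}

# One KNN evaluation per target, mirroring the four screening outcomes.
results = {name: cross_validated_confusion(X, labels) for name, labels in Y.items()}
for name, matrix in results.items():
    print(name, matrix, round(accuracy(matrix), 4))
```

Because the synthetic targets are imbalanced, as medical screening labels often are, accuracy alone can look high even when few positives are found. The confusion matrix counts make that visible.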