Abstract
Breast cancer is the most common cancer in women, and its death rate still ranks second among cancers. Related studies often report high accuracy for their machine learning approaches. Without efficient pre-processing, however, the proposed breast cancer prediction models remain questionable. This research therefore aims to improve the accuracy of machine learning methods through more efficient pre-processing for breast cancer prediction: Missing Value Replacement, Data Transformation, Smoothing Noisy Data, Feature Selection / Attribute Weighting, Data Validation, and Unbalanced Class Reduction. The study proposes the following approaches: C4.5 with Z-Score and Genetic Algorithm for the Breast Cancer Dataset (77.27% accuracy); 7-Nearest Neighbor with Min-Max Normalization and Particle Swarm Optimization for the Wisconsin Breast Cancer Dataset - Original (97.85% accuracy); Artificial Neural Network with Z-Score and Forward Selection for the Wisconsin Breast Cancer Dataset - Diagnostics (98.24% accuracy); and 11-Nearest Neighbor with Min-Max Normalization and Particle Swarm Optimization for the Wisconsin Breast Cancer Dataset - Prognostic (83.33% accuracy). These approaches perform better than standard machine learning methods and than the best methods proposed in previous related studies.
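The Data Transformation step named above relies on Z-Score and Min-Max Normalization. The sketch below illustrates the two transforms on a single feature column. It is a minimal illustration under assumptions: the function names and sample values are hypothetical, and the use of the population standard deviation and a [0, 1] target range is assumed here, not taken from the paper.

```python
from statistics import mean, pstdev


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into [new_min, new_max] (assumed target range)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population, not sample, deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column, for illustration only.
feature = [1.0, 3.0, 5.0, 10.0]
scaled = min_max_normalize(feature)
standardized = z_score(feature)
```

Min-max scaling keeps every feature in the same bounded range, which matters for distance-based methods such as k-NN. Z-score standardization centers each feature instead and is less sensitive to the exact minimum and maximum.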
Highlights
Breast cancer is the most common cancer in women, and its death rate still ranks second among cancers.
The study proposes the following approaches: C4.5 with Z-Score and Genetic Algorithm for the Breast Cancer Dataset (77.27% accuracy); 7-Nearest Neighbor with Min-Max Normalization and Particle Swarm Optimization for the Wisconsin Breast Cancer Dataset - Original (97.85% accuracy); Artificial Neural Network with Z-Score and Forward Selection for the Wisconsin Breast Cancer Dataset - Diagnostics (98.24% accuracy); and 11-Nearest Neighbor with Min-Max Normalization and Particle Swarm Optimization for the Wisconsin Breast Cancer Dataset - Prognostic (83.33% accuracy).
Summary
The ML methods used to classify (predict) breast cancer are ANN, SVM, C4.5, NB, and k-NN. For the other datasets, the approach in (19) is applied with each ML method. Here n is the number of distinct values in a node. The NB algorithm (22) can also be used for classification, where P(Ck) is the probability of class k, for k = 1, 2, ..., l. The k-NN algorithm can also be used for classification, using the Euclidean distance (24). The accuracy evaluation of the ML methods after Missing Value Replacement (MVR) and Data Transformation (DT) shows that:
1. On the BCD dataset, C4.5 gives the best performance, with 73.78% accuracy.
2. On the WBCDO dataset, NB gives the best performance.
3. On the WBCDD dataset, ANN gives the best performance.
4. On the WBCDP dataset, ANN gives the best performance, with 81.82% accuracy.
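The summary refers to k-NN classification with the Euclidean distance (24). The sketch below shows that idea as a majority vote among the k nearest training points. It is an illustration under assumptions, not the paper's implementation: the function names, the toy feature vectors, and the labels are hypothetical, and the features are assumed to be already normalized.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=7):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Sort training points by distance to the query and keep the k closest.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical, already-normalized toy data (not from any of the datasets).
train_X = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.85),
]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

near_benign = knn_predict(train_X, train_y, (0.20, 0.20), k=3)
near_malignant = knn_predict(train_X, train_y, (0.80, 0.80), k=3)
```

With two classes, an odd k such as the 7 and 11 reported in the abstract avoids tied votes.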