Having a healthy baby is every mother's hope. Unfortunately, maternal and fetal mortality remains high, which makes early risk detection for pregnant women essential. A cardiotocograph examination is necessary to safeguard maternal and fetal health, and classification is one method for detecting risk from its results. This research analyzes the optimal choice of k value and distance measure in the K-Nearest Neighbor (k-NN) method, and it is expected to serve as a primary reference for other studies that examine the same dataset or develop k-NN. Feature selection is needed to optimize the classification method, particularly to improve accuracy. This study used the cardiotocography dataset, which comes from cardiotocograph examinations of fetal condition and is available from the UCI Machine Learning Repository website. The dataset consists of 2,126 records with 22 features (variables) and three classes: normal, suspect, and pathological. The study employed the k-NN method together with backward elimination feature selection based on ordinary least squares regression. Testing covered three distance measures (Euclidean, Manhattan, and Minkowski) and four k values. Each scenario was evaluated by accuracy, derived from the confusion matrix, and by execution time. The study compared k-NN with backward elimination against k-NN without feature selection. The best accuracy for both was 91%, obtained in each case with k = 3 and the Manhattan distance. Backward elimination reduced the number of features from 22 to 14.
Meanwhile, k-NN with backward elimination ran slightly faster: its average execution times for the Euclidean, Manhattan, and Minkowski distances were 26.54, 19.23, and 68.09 seconds, compared with 26.83, 19.39, and 68.84 seconds for k-NN without feature selection. In conclusion, backward elimination did not increase accuracy, since both configurations reached the same accuracy. However, k-NN with backward elimination was faster by 0.29, 0.16, and 0.75 seconds, respectively.
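To make the compared scenarios concrete, the following is a minimal sketch of k-NN classification under the three distance measures, not the study's implementation. It relies on the fact that the Minkowski distance of order p reduces to the Manhattan distance at p = 1 and to the Euclidean distance at p = 2. The order p = 3 for the Minkowski case, the function names `minkowski` and `knn_predict`, and the toy data are all assumptions made for illustration; the abstract does not state them.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p (p=1: Manhattan, p=2: Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k=3, p=1):
    """Label `query` by majority vote among its k nearest training points."""
    # Rank training points by distance to the query and keep the k closest.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: minkowski(pair[0], query, p),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical; not taken from the cardiotocography dataset).
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["normal", "normal", "normal",
           "pathological", "pathological", "pathological"]

# p=3 for the Minkowski scenario is an assumed value.
for name, p in [("Manhattan", 1), ("Euclidean", 2), ("Minkowski", 3)]:
    print(name, knn_predict(train_X, train_y, (0.5, 0.5), k=3, p=p))
```

In this sketch, feature selection would simply mean dropping columns from `train_X` and `query` before calling `knn_predict`. Fewer features means fewer terms in each distance computation, which is consistent with the slightly lower execution times reported.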