Abstract

Missing values often occur in classification tasks, because information about an object was not given, is difficult to obtain, or is unavailable. They reduce both accuracy and data quality during analysis. A correlation approach was used because it reveals the existence and strength of the relationships between variables related to the object or subject under study. The classification method used is K-NN, chosen because it is a classification method with strong consistency: it classifies a new case by calculating its closeness to existing cases and taking the K nearest neighbors. The correlation approach can overcome missing values, as evidenced by the improved classification results and the elimination of unclassified data. A questionnaire served as the measuring instrument: it contains questions given to the respondents, and the questionnaire results were analyzed to determine the correlation level of the backup data. Once that correlation level was obtained, the backup data were used as substitutes for the missing values. Before replacement, classifying the 500 records containing missing values gave 88 students in the natural science major, 126 in the social science major, 271 in the language major, and 15 unclassified/false. After replacement, the same 500 records were classified as 102 students in the natural science major, 316 in the social science major, and 82 in the language major, with no unclassified data. Experiments were run with k = 3, 5, 7, 9, and 11; k = 5 gave a high accuracy of 97.0%, so k = 5 was used for determining majors with the K-NN method in this study.
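
The two steps described above (substituting missing values from correlated backup data, then classifying with K-NN) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sample scores, the choice of Pearson correlation, the rule of picking the backup variable with the highest positive correlation, and the Euclidean distance are all assumptions made for illustration only.

```python
import math
from collections import Counter

def pearson(xs, ys):
    """Pearson correlation over the positions where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs))
    return cov / (sx * sy) if sx and sy else 0.0

def impute_from_backup(primary, backups):
    """Fill None entries in `primary` from the most positively correlated backup.

    Assumption: the backup variable is on the same scale as the primary one,
    so its value can be substituted directly.
    """
    best = max(backups, key=lambda b: pearson(primary, b))
    return [b if p is None else p for p, b in zip(primary, best)]

def knn_predict(train_X, train_y, query, k=5):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical data: a score column with two missing entries and two
# candidate backup columns (e.g. derived from questionnaire answers).
scores = [80, None, 60, 90, None]
backup_a = [78, 70, 62, 88, 55]   # strongly correlated with scores
backup_b = [50, 90, 70, 65, 40]   # weakly (negatively) correlated
imputed = impute_from_backup(scores, [backup_a, backup_b])
print(imputed)  # [80, 70, 60, 90, 55]

# Hypothetical two-feature training set with two of the majors as labels.
train_X = [(90, 60), (85, 65), (88, 58), (60, 90), (62, 85), (58, 88)]
train_y = ["natural science"] * 3 + ["social science"] * 3
prediction = knn_predict(train_X, train_y, (86, 62), k=3)
print(prediction)  # natural science
```

In this sketch, k is a plain parameter, so the comparison over k = 3, 5, 7, 9, and 11 reported in the abstract would amount to calling `knn_predict` with each value on held-out data and comparing the resulting accuracies.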
