Comparison of Data Mining Classification Algorithms for Student Performance

Emny Harna Yossy, Yaya Heryadi, Lukas Lukas

doi:10.1109/tale48000.2019.9225887

Abstract

Student performance plays an important role in measuring student quality, and it can be predicted using data mining techniques, one of which is classification. This research aims to determine which classification model performs best on student performance data. The data are the Student Performance dataset from the UCI Machine Learning Repository. The study compares seven methods: K-nearest neighbor, classification and regression trees, naïve Bayes, AdaBoost, extra tree, Bernoulli naïve Bayes, and random forest (reported in two variants, E and G). The comparison is implemented in Python, and the methods are evaluated using cross-validation. On the student math data, the results are: K-nearest neighbor 86.52%, classification and regression trees 86.08%, naïve Bayes 84.78%, AdaBoost 88.04%, extra tree 81.30%, Bernoulli naïve Bayes 79.34%, random forest E 87.82%, and random forest G 89.78%. Based on these results, the best classification method is random forest G, at 89.78%.
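The cross-validated comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it uses only the Python standard library, synthetic two-class data as a stand-in for the student dataset, a hand-written K-nearest neighbor classifier, and a majority-class baseline in place of the other methods. The function names, the choice of 10 folds, and k = 5 are all assumptions made for illustration; the abstract does not state these settings. In practice a library such as scikit-learn would likely be used instead.

```python
import random
from collections import Counter


def make_toy_data(n=200, seed=0):
    """Synthetic two-class data (hypothetical stand-in for the student dataset)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 0 is centered at (0, 0), class 1 at (2, 2).
        x = (rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0))
        data.append((x, label))
    return data


def knn_predict(train, x, k=5):
    """Predict by majority vote among the k nearest training points."""
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - x[0]) ** 2 + (item[0][1] - x[1]) ** 2,
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def majority_predict(train, x):
    """Baseline: always predict the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def cross_val_accuracy(data, predict, n_folds=10):
    """Mean accuracy over n_folds folds; each fold is held out exactly once."""
    scores = []
    for i in range(n_folds):
        test = data[i::n_folds]
        train = [row for j, row in enumerate(data) if j % n_folds != i]
        correct = sum(predict(train, x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


data = make_toy_data()
results = {
    "k-nearest neighbor": cross_val_accuracy(data, knn_predict),
    "majority baseline": cross_val_accuracy(data, majority_predict),
}
# Rank the methods by mean cross-validated accuracy, best first.
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.2%}")
```

Every method is scored on the same folds, so the ranking reflects differences between the methods rather than differences in how the data were split.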
