Objectives: Universities accumulate large amounts of student data in electronic form. Filtering the stored data by specific criteria becomes difficult when done manually, so tools that analyse the data in statistical, descriptive or computational ways are important. Methods/Statistical Analysis: This study presents an analysis of the top ten machine learning algorithms used in classification and prediction. The WEKA tool is used to run the experiments and to measure accuracy and other result parameters when evaluating categorical prediction of student performance. An analysis was also carried out to estimate these parameters as a function of the number of samples. Findings: The classification accuracy of around 12 WEKA classifiers (REPTree, Naive Bayes, J48, Bagging, IBk, Multilayer Perceptron, Random Forest, Random Tree, Stacking, AdaBoost, Logistic and SMO) was compared on datasets with varying numbers of instances. Based on the results, the best five methods were chosen and compared on the entire dataset for prediction results. Ten machine learning algorithms were considered, and their classification accuracy, Kappa statistic and mean absolute error were compared. Bagging, Random Forest, IBk and Random Tree were selected at the first level based on the Kappa statistic. At the second level, which filtered on accuracy, IBk and Random Tree were chosen as the final suitable models for the provided dataset. Application/Improvements: A questionnaire for students and teachers is to be developed to evaluate and predict results from various angles based on various parameters. The factors that contribute positively and negatively to the institution's results are to be analysed. Keywords: Educational Data Mining, Analysis, Prediction, Machine Learning, Student Performance, WEKA
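The two-level selection described in the findings (filter by Kappa statistic, then by accuracy) can be illustrated with a minimal Python sketch. This is not the study's WEKA workflow: the function names, the `keep_first`/`keep_final` cut-offs and the toy labels and scores are all hypothetical, chosen only to show how the two measures are computed and combined. Mean absolute error is left out because, for categorical predictions, it depends on per-class probability estimates that the abstract does not describe.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of instances whose predicted label equals the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def cohen_kappa(actual, predicted):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(actual)
    p_observed = accuracy(actual, predicted)
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: sum over classes of P(actual = c) * P(predicted = c).
    p_chance = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if p_chance == 1.0:
        # Kappa is undefined when only one class occurs; 0.0 is a
        # convention adopted for this sketch, not a WEKA behaviour.
        return 0.0
    return (p_observed - p_chance) / (1.0 - p_chance)


def two_level_filter(scores, keep_first, keep_final):
    """Keep the top `keep_first` models by kappa, then the top `keep_final`
    of those by accuracy. `scores` maps model name -> (kappa, accuracy)."""
    by_kappa = sorted(scores, key=lambda m: scores[m][0], reverse=True)
    shortlisted = by_kappa[:keep_first]
    by_accuracy = sorted(shortlisted, key=lambda m: scores[m][1], reverse=True)
    return by_accuracy[:keep_final]


# Toy labels, purely illustrative (not data from the study).
actual = ["pass", "pass", "fail", "fail"]
predicted = ["pass", "fail", "fail", "fail"]
print(accuracy(actual, predicted))     # 0.75
print(cohen_kappa(actual, predicted))  # 0.5

# Hypothetical (kappa, accuracy) pairs for four placeholder models.
toy_scores = {
    "A": (0.90, 0.80),
    "B": (0.80, 0.95),
    "C": (0.20, 0.99),
    "D": (0.85, 0.90),
}
print(two_level_filter(toy_scores, keep_first=3, keep_final=2))  # ['B', 'D']
```

In the toy run, model C has the highest accuracy but is dropped at the first level because its kappa is low. This is the reason for filtering on kappa before accuracy: kappa discounts agreement that could arise by chance, for example on an imbalanced class distribution.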