Abstract

Student dropout is a major concern for university academics and management. A high dropout rate damages a university's reputation, reduces student enrollment, cuts university revenue, causes financial losses for the country, and contributes to social problems among students. In this study, two popular classifiers, a decision tree and a logistic regression model, were used to predict student dropout. Several experimental settings were employed, including three data partitioning schemes and different types of decision tree and regression models. For the logistic regression model, different data imputation and transformation methods were tested to ensure that the resulting model is valid. A total of 7,706 student records, extracted from the database of a private university in Malaysia and covering 2018 to 2019, were used to assess the capability of the classifiers. Classifier performance was evaluated using two machine learning performance measures: accuracy and misclassification rate. The results indicate that the chi-square decision tree (2 branches) achieved slightly better classification performance, 89.49%, on the 80/20 data partition. The chosen model also identified the most important variable for accurate prediction of student dropout. Applying this model has the potential to identify at-risk students accurately and to reduce student dropout rates.
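The two performance measures and the 80/20 partition mentioned above can be illustrated with a minimal sketch. It assumes the standard definitions (accuracy is the share of correct predictions, and misclassification rate is its complement), which the abstract does not spell out. The function names, the shuffling step, and the seed are hypothetical choices for illustration, not the study's actual procedure.

```python
import random


def split_80_20(records, seed=0):
    """Shuffle the records and split them into 80% training and 20% validation.

    Assumption: a simple random split; the study's exact partitioning
    procedure is not described in the abstract.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def misclassification_rate(actual, predicted):
    """Fraction of predictions that do not match the actual labels."""
    return 1.0 - accuracy(actual, predicted)


if __name__ == "__main__":
    # Toy labels only (1 = dropout, 0 = retained); not data from the study.
    actual = [1, 0, 1, 1]
    predicted = [1, 0, 0, 1]
    print(accuracy(actual, predicted))
    print(misclassification_rate(actual, predicted))
```

Under these definitions, and assuming the reported 89.49% is an accuracy figure, the corresponding misclassification rate would be 10.51%.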
