Abstract

Many research studies have confirmed that applying machine learning methods to predict students at risk of academic failure can help improve student performance in higher education institutions. These studies have also emphasised how strongly the quality of the input data, its careful pre-processing, and thorough feature selection affect the final performance metrics of the applied classification algorithms. However, the stand-alone application of well-known classification algorithms still has performance limitations, which can be overcome using ensembles of classifiers. Therefore, this paper aims to evaluate the overall contribution of selected ensemble methods to the early prediction of students at risk of dropping out of learning courses. All phases of the CRISP-DM methodology are described to explain the complexity of practically adopting voting-based heterogeneous ensemble methods on data obtained from a course in a university learning management system. These ensembles are compared with traditional classification algorithms enhanced by ensemble learning with the AdaBoost and XGBoost methods. The study confirmed that the features characterising students’ performance and their interactions in the course can be considered the most significant for identifying students at risk of dropping out. Moreover, with the advanced ensemble algorithms, the overall accuracy, recall, precision, and F1 score improved by 2-4%. Finally, the paper summarises the findings and their implications for effective prediction.
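To make the idea of a voting-based ensemble and the four reported metrics concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the three base "models", and the toy labels (1 = at risk of dropping out, 0 = not at risk) are all hypothetical and chosen only for illustration. It shows hard (majority) voting over the predictions of several base classifiers, followed by accuracy, precision, recall, and F1 computed for the positive class.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label chosen by most models."""
    # zip(*...) regroups the per-model lists into one tuple of votes per sample.
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data (assumption, not taken from the study).
y_true = [1, 0, 1, 1, 0, 0]
# Predictions of three different base classifiers. An odd number of voters
# on binary labels avoids ties in the majority vote.
model_a = [1, 0, 0, 1, 0, 1]
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("model_a", model_a), ("model_b", model_b),
                    ("model_c", model_c), ("ensemble", ensemble)]:
    print(name, binary_metrics(y_true, preds))
```

In this toy setting each base model errs on different samples, so the majority vote corrects the individual mistakes. Real course data will rarely behave this cleanly, and any gain depends on how diverse the base classifiers' errors are.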

