Abstract

This study focused on predicting at-risk students at the Open University (OU), a UK university that offers distance-learning courses and adult education. The research drew on publicly available data provided by the Open University for the year 2013–2014. The data were treated as a time series over the semester, and data from previous semesters were used to predict the current semester's results. Each course was predicted separately so that the setup reflected real use as closely as possible. Three different methods for selecting training data were considered. Because at-risk predictions need to reach the instructor every week, four representative time points during the semester were chosen to assess the predictions. Eight single and three integrated (ensemble) machine-learning algorithms were used to compare the prediction results. The results show that training on course data with the same semester code reduced prediction computation time and improved prediction accuracy at all time points. In week 16, predictions made with the voting classifier method were more accurate and more stable than predictions made with a single algorithm. The prediction accuracy of this model reached 81.2% for the midterm predictions and 84% for the end-of-semester predictions. Finally, Shapley additive explanation (SHAP) values were used to explore the main predictor variables of the prediction model.
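
To illustrate how a voting classifier combines several single algorithms, here is a minimal sketch in plain Python. It assumes hard (majority) voting over binary labels, with 1 meaning at-risk and 0 meaning not at-risk; the abstract does not state whether hard or soft voting was used, and the function name `hard_vote` and the sample predictions are hypothetical, not taken from the study.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label per student across several classifiers.

    predictions: list of per-classifier label lists, all of equal length.
    Assumption: hard (majority) voting; the study's actual voting scheme
    is not specified in the abstract.
    """
    n_students = len(predictions[0])
    if any(len(p) != n_students for p in predictions):
        raise ValueError("all classifiers must predict for the same students")
    voted = []
    for i in range(n_students):
        # Count the labels that the classifiers assigned to student i.
        votes = Counter(p[i] for p in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Hypothetical predictions from three single classifiers for four students.
clf_a = [1, 0, 1, 0]
clf_b = [1, 1, 0, 0]
clf_c = [0, 1, 1, 0]

print(hard_vote([clf_a, clf_b, clf_c]))
```

Using an odd number of classifiers avoids ties for binary labels. A single classifier's error on one student is outvoted when the other two agree, which is one plausible reason an ensemble can be more stable than any single algorithm.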
