Abstract

Distance education supports lifelong learning and empowers individuals in rapidly changing societal conditions, yet it suffers from high dropout rates caused by a range of individual and societal obstacles. This study addresses the challenge of creating a practical prediction model by analyzing extensive real-world time-point data from a well-established online university in Seoul. Covering 144,540 instances from 2018 to 2022, the study integrates diverse datasets to compare the accuracy of models based on longitudinal, semester-wise, and gender-specific datasets. A stepwise backward elimination process applied to demographic, academic, and online metrics identified significant dropout indicators, including age (particularly when binned), residential area, specific occupations, GPA, and LMS log metrics. The study found that, despite societal changes, data from the most recent four semesters can be used effectively to train stable prediction models. Gender-based analysis showed that different factors influence dropout risk for males and females. The Light Gradient Boosting Machine (LGBM) algorithm achieved the best prediction accuracy, with the ROC-AUC metric affirming its superiority. Logistic regression, however, performed competitively and offered in-depth interpretation. In South Korea's distinct educational setting, combining advanced algorithms like LGBM with the interpretive strength of logistic regression is key to effective student support strategies.

