Abstract

A dropout early warning system enables schools to identify students who are at risk of dropping out before they do so, to respond to them promptly, and ultimately to help potential dropout students continue their learning for a better future. However, the inherent class imbalance between dropout and non-dropout students can make it difficult to build an accurate predictive model for a dropout early warning system. The present study aimed to improve the performance of a dropout early warning system: (a) by addressing the class imbalance issue using the synthetic minority oversampling technique (SMOTE) and ensemble methods in machine learning; and (b) by evaluating the trained classifiers with both receiver operating characteristic (ROC) and precision–recall (PR) curves. To that end, we trained four classifiers (random forest, boosted decision tree, random forest with SMOTE, and boosted decision tree with SMOTE) on big data samples of 165,715 high school students from the National Education Information System (NEIS) in South Korea. According to our ROC and PR curve analysis, the boosted decision tree showed the best performance.
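To make the oversampling idea concrete, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation: the function name `smote`, its parameters, and the toy feature vectors are all hypothetical, and the sketch assumes numeric features only. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolation (illustrative sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point (itself excluded).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: two numeric features for three dropout students.
dropouts = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.3]]
new_points = smote(dropouts, n_new=4, k=2)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region those samples span. In practice one would more likely use a maintained library implementation than hand-rolled code like this, and would oversample the training split only.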

Highlights

  • The negative consequences of students’ dropping out of school are significant for both the individual and society

  • Given our specific goal of maximizing our chance of supporting the successful learning of all students while minimizing the cost of intervention, the present study aimed to improve the performance of a dropout early warning system: (a) by addressing the class imbalance issue using the synthetic minority oversampling technique (SMOTE) and ensemble methods in machine learning; and (b) by evaluating the trained classifiers with both receiver operating characteristic (ROC) curves and precision–recall (PR) curves
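The difference between the two kinds of curves can be illustrated with a small sketch in plain Python. The labels and scores below are hypothetical toy values, not data from the study, and the helper name `roc_and_pr_points` is an assumption made for this example. For each decision threshold, the function reports the false positive rate and recall (one ROC point) together with precision and recall (one PR point).

```python
def roc_and_pr_points(labels, scores):
    """Return (threshold, fpr, recall, precision) for each distinct score threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A student is flagged as at risk when the score reaches the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        recall = tp / n_pos          # true positive rate, shared by both curves
        fpr = fp / n_neg             # x-axis of the ROC curve
        precision = tp / (tp + fp)   # y-axis of the PR curve
        points.append((threshold, fpr, recall, precision))
    return points


# Hypothetical toy data: 2 dropouts (label 1) among 10 students.
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1]
points = roc_and_pr_points(labels, scores)
```

In this toy example, the threshold 0.7 flags both dropouts with a false positive rate of only 0.125, yet precision is 2/3 because one of the three flagged students is not a dropout. When negatives greatly outnumber positives, a small false positive rate can still mean many unnecessary interventions, which is why a PR curve is a useful complement to a ROC curve.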

Summary

Introduction

The negative consequences of students’ dropping out of school are significant for both the individual and society. Society suffers losses because the nation’s productive capacity can be undermined by a shortage of skilled workers, and students who drop out are more likely to be frequent recipients of welfare and unemployment subsidies [2]. Because of these negative consequences, student dropout has long been considered a serious educational problem by educators, researchers, and policymakers. Students at risk are likely to drop out without carefully considering the negative consequences of their decision or without having an opportunity to consult with experts.
