Abstract

This paper combines machine learning with economic theory to analyse high school dropout. It provides an algorithm that predicts which students will drop out of high school using only information from 9th grade. The analysis shows that a parsimonious early warning system, as implemented in many schools, yields poor predictions. Schools can obtain more precise predictions by exploiting the available high-dimensional data with machine learning tools such as Support Vector Machines, Boosted Regression and Post-LASSO. Goodness-of-fit criteria are selected based on the context and the underlying theoretical framework: model parameters are calibrated to take into account the policy goal (minimizing the expected dropout rate) and the school budget constraint. Finally, the study verifies the existence of heterogeneity among students at risk of dropping out by using unsupervised machine learning to divide them into different clusters.
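
To make the idea of a budget-constrained criterion concrete, the following is a minimal sketch and not the paper's actual calibration procedure, which the abstract does not spell out. It assumes a school can support only a fixed number of students (`capacity`), targets the students with the highest predicted risk, and measures the share of actual dropouts that were reached. The function names, the scores and the outcomes are all hypothetical and chosen for illustration only.

```python
from typing import List


def target_under_budget(risk: List[float], capacity: int) -> List[int]:
    """Return indices of the `capacity` students with the highest predicted risk."""
    if capacity < 0:
        raise ValueError("capacity must be non-negative")
    # Rank students from highest to lowest predicted dropout risk.
    ranked = sorted(range(len(risk)), key=lambda i: risk[i], reverse=True)
    return sorted(ranked[:capacity])


def dropouts_reached(targeted: List[int], dropped_out: List[bool]) -> float:
    """Share of actual dropouts who were among the targeted students."""
    total = sum(dropped_out)
    if total == 0:
        return 0.0
    hit = sum(1 for i in targeted if dropped_out[i])
    return hit / total


# Hypothetical predicted risks and observed outcomes (invented, not from the paper).
risk = [0.10, 0.80, 0.35, 0.90, 0.05, 0.60]
dropped_out = [False, True, False, True, False, False]

# Assumed budget: the school can support two students.
targeted = target_under_budget(risk, capacity=2)
share = dropouts_reached(targeted, dropped_out)
```

Under this framing, competing prediction models would be compared on the share of dropouts reached at the affordable capacity rather than on overall accuracy. This is one possible reading of "goodness-of-fit criteria selected based on the context".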
