Student attrition in higher education is a worldwide problem that affects both individuals and institutions. Students who do not complete their studies often face economic and social difficulties, while institutions suffer damage to their reputation and operational efficiency. This paper proposes a prediction model based on the XGBoost algorithm to assess students' academic progress and dropout risk. The model incorporates academic, demographic, and socio-economic factors to give a comprehensive view of students' educational trajectories. The study uses the Predict Students' Dropout and Academic Success dataset, which comprises 4,424 records and 36 attributes. The data were standardized with StandardScaler and divided into five training/testing scenarios, ranging from a 50:50 to a 90:10 split. The model was evaluated using accuracy, precision, recall, and F1-score. The results show that the model performs best in the 80:20 scenario, with 88% precision and an 81% F1-score, indicating a favorable balance between predictive accuracy and the ability to identify at-risk students. The study demonstrates that XGBoost can serve as a dependable predictive tool to support decision-making in the education sector. These findings provide a foundation for designing targeted interventions to improve student retention. Future work may investigate real-time data and more sophisticated models to further improve predictive accuracy.
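The evaluation protocol described above can be illustrated with a minimal sketch in plain Python. It covers only the split sizes and the metric definitions, not the XGBoost training itself. Several details are assumptions rather than facts from the paper: the intermediate 60:40 and 70:30 scenarios, the rounding of the test share upward, the treatment of dropout as a binary positive class, and the helper names `split_sizes` and `precision_recall_f1`, which are hypothetical.

```python
N_RECORDS = 4424  # dataset size stated in the abstract

# Training percentages per scenario. The abstract names 50:50, 80:20 and 90:10;
# the 60 and 70 entries are assumed to fill out the five scenarios.
TRAIN_PERCENTAGES = [50, 60, 70, 80, 90]


def split_sizes(n_records, train_pct):
    """Return (n_train, n_test), rounding the test share up (an assumed convention)."""
    n_test = (n_records * (100 - train_pct) + 99) // 100  # integer ceiling
    return n_records - n_test, n_test


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one positive class (assumed: dropout)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    for pct in TRAIN_PERCENTAGES:
        n_train, n_test = split_sizes(N_RECORDS, pct)
        print(f"{pct}:{100 - pct} -> train={n_train}, test={n_test}")

    # Toy labels for illustration only; these are not results from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
    print(precision_recall_f1(y_true, y_pred))
```

Because F1 is the harmonic mean of precision and recall, a model can report high precision while its F1 is pulled down by lower recall, which is why the abstract cites both figures when comparing scenarios.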