Abstract

This study analyzed institutional factors affecting college student retention rates using a machine learning predictive model. University information disclosure data for 2022, 2021, and 2022 were analyzed with the randomForest package in R, using three data sets: all four-year universities, capital-region universities, and non-capital-region universities. First, the randomForest model achieved an accuracy of approximately 70%. Second, 10 key variables were identified in the data set of all universities and 9 key variables in the capital-region and non-capital-region data sets, although the composition and ranking of the key variables differed across the data sets. Third, the partial dependence plots reveal the value ranges of the variables over which the retention rate either decreased or remained stable. Institutions may apply these results by using random forest techniques for data-driven decision-making when formulating retention policies.
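
To illustrate how a partial dependence curve of the kind mentioned above can expose a range where the predicted retention rate decreases and a range where it stays flat, here is a minimal sketch in Python. It is not the study's R code. The function `toy_predict` is a hypothetical stand-in for a fitted random forest, and the feature names `tuition` and `dorm_rate` and all numbers are invented for illustration only.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        # Override one feature in every row; the other features stay as observed.
        modified = [{**row, feature: value} for row in rows]
        curve.append((value, mean(predict(r) for r in modified)))
    return curve


def toy_predict(row):
    # Hypothetical model (assumption): predicted retention falls as "tuition"
    # rises up to 8.0, then stays flat; "dorm_rate" adds a small positive effect.
    return 0.9 - 0.02 * min(row["tuition"], 8.0) + 0.01 * row["dorm_rate"]


# Invented observations, used only to average over the other feature.
rows = [
    {"tuition": 5.0, "dorm_rate": 1.0},
    {"tuition": 7.0, "dorm_rate": 3.0},
]

curve = partial_dependence(toy_predict, rows, "tuition", [4.0, 8.0, 12.0])
for value, avg_prediction in curve:
    print(value, round(avg_prediction, 4))
```

Reading the resulting curve from left to right shows where the averaged prediction drops and where it levels off, which is the kind of information the abstract attributes to its partial dependence plots.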
