The purpose of this study was to develop random forest classifier models (a type of supervised machine learning algorithm) that could (1) predict which students will or will not complete the DVM degree requirements and (2) identify the top predictors of academic success and completion of the DVM degree. The study used Ross University School of Veterinary Medicine student records from 2013 to 2022. Twenty-four variables encompassing demographic (eg, age, race), academic (eg, grade point average), and financial aid (eg, outstanding balances) data were assessed in 11 cross-validated random forest models. One model was built using all years of data, and 10 individual models, one for each enrollment year, were developed to compare how the top predictors of success varied among years. Only academic and financial factors were consistently identified as features of importance (predictors) across all models. Demographic factors such as race were not important for predicting student success. All models performed very well to excellently on multiple performance metrics, including accuracy, which ranged from 96.1% to 99%, and area under the receiver operating characteristic curve, which ranged from 98.1% to 99.9%. The random forest algorithm is a powerful machine learning prediction model that performs well with veterinary student academic records and is customizable, so that variables important to each veterinary school's student population can be assessed. Identifying predictors of success, as well as at-risk students, is essential for providing targeted curricular interventions to increase retention and achieve timely completion of a DVM degree.
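For readers less familiar with the two headline metrics, the following minimal sketch shows how accuracy and area under the receiver operating characteristic curve can be computed from a classifier's predicted scores. It is an illustration only, not the study's code: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions introduced here and do not come from the student records.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 1 = completed, 0 = did not.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
# Hypothetical predicted probabilities of completion from some classifier.
scores = [0.95, 0.80, 0.70, 0.40, 0.55, 0.20, 0.90, 0.10]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy = {acc:.3f}, ROC AUC = {auc:.3f}")
```

On this toy data the script prints `accuracy = 0.750, ROC AUC = 0.933`. Accuracy depends on the chosen threshold, whereas the area under the curve summarizes how well the scores rank completers above non-completers across all thresholds, which is one reason the two metrics are commonly reported together.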