Abstract

This study evaluates the performance of the Random Forest algorithm in educational data mining, with the goal of improving graduation on-time (GOT) prediction by applying methods for imbalanced data. The methods considered are random under-sampling (RUS), random over-sampling (ROS), a hybrid of RUS and ROS, the Synthetic Minority Over-sampling Technique for Nominal and Continuous features (SMOTE-NC), and a hybrid of SMOTE-NC and RUS. After applying each method, we analyze its performance on training and testing data. On training data, the RUS-ROS hybrid performed best among the methods compared, while on testing data the SMOTE-NC and RUS hybrid performed best based on AUC values. The results show that applying an imbalanced data method significantly improved the ability of the Random Forest algorithm to predict GOT in educational data. We discuss the implications for educational data mining applications and provide suggestions for future research.
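To make the RUS-ROS hybrid concrete, the following is a minimal sketch using only the Python standard library. It is not the study's implementation: the class labels (`on_time`, `late`), the 90/10 split, the toy rows, and the choice of the midpoint between the two class sizes as the resampling target are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_under_sample(rows, labels, majority, target, rng):
    """RUS: keep at most `target` randomly chosen majority rows, plus all other rows."""
    maj_idx = [i for i, y in enumerate(labels) if y == majority]
    other_idx = [i for i, y in enumerate(labels) if y != majority]
    kept = rng.sample(maj_idx, min(target, len(maj_idx)))
    idx = sorted(kept + other_idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


def random_over_sample(rows, labels, minority, target, rng):
    """ROS: duplicate randomly chosen minority rows until the class has `target` rows."""
    min_idx = [i for i, y in enumerate(labels) if y == minority]
    extra = [rng.choice(min_idx) for _ in range(max(0, target - len(min_idx)))]
    idx = list(range(len(labels))) + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]


def hybrid_rus_ros(rows, labels, seed=0):
    """Hybrid: shrink the majority and grow the minority to a common size.

    Assumption: binary labels, and the common size is the midpoint of the
    two original class counts. The study may use a different target.
    """
    rng = random.Random(seed)
    (majority, n_maj), (minority, n_min) = Counter(labels).most_common(2)
    target = (n_maj + n_min) // 2
    rows, labels = random_under_sample(rows, labels, majority, target, rng)
    rows, labels = random_over_sample(rows, labels, minority, target, rng)
    return rows, labels


# Hypothetical toy data: 90 "on_time" students and 10 "late" students.
labels = ["on_time"] * 90 + ["late"] * 10
rows = [[i] for i in range(100)]

bal_rows, bal_labels = hybrid_rus_ros(rows, labels)
balanced_counts = Counter(bal_labels)
print(balanced_counts)
```

In practice, resampling of this kind is normally applied to the training split only, so that the testing data keeps its original class distribution when AUC is computed.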
