Abstract

Interest in predicting student performance, preventing failure, and identifying the factors that drive student dropout has grown significantly in recent years. The dropout rate is an important indicator in online and open distance learning courses. We propose a naive Bayes classification method for student dropout prediction. This work addresses the problem of forecasting student dropout in higher education using machine learning, with particular emphasis on the naive Bayes and random forest algorithms. Following a review of the existing literature and approaches, the study aims to predict dropout accurately using data mining methods and machine learning algorithms. The method consists of data collection from a Kaggle dataset, data preparation that addresses class imbalance via SMOTE oversampling, and algorithm selection. Random forest and naive Bayes outperform the other machine learning algorithms evaluated in terms of accuracy, sensitivity, specificity, and precision. The study underscores the importance of considering diverse factors such as demographic data, socioeconomic factors, and academic performance in dropout prediction models. The implications of this research extend beyond academia, with the potential to inform proactive interventions and support systems, ultimately leading to improved student outcomes and institutional effectiveness. For binary classification on the dataset used in this project, naive Bayes and random forest combined with SMOTE oversampling performed best.

Keywords: SMOTE oversampling, machine learning, random forest, naive Bayes.
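To make the two technical ingredients named above concrete, the following is a minimal sketch, not the authors' implementation. It shows the core idea of SMOTE (synthesizing minority-class samples by interpolating between a sample and one of its nearest minority neighbours) and how accuracy, sensitivity, specificity, and precision follow from a binary confusion matrix. The function names `smote_like_oversample` and `binary_metrics`, the parameter defaults, and the toy data are all hypothetical; they are not taken from the paper or its Kaggle dataset. In practice a library implementation such as the `SMOTE` class in imbalanced-learn would likely be used instead.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a random
    minority sample and one of its k nearest minority neighbours.
    Simplified illustration of the SMOTE idea, not a library implementation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def binary_metrics(y_true, y_pred, positive=1):
    """Compute the four metrics named in the abstract from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # recall on positives
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # recall on negatives
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Hypothetical toy data: four "dropout" samples with two made-up features.
dropout = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.4]]
new_points = smote_like_oversample(dropout, n_new=6)

# Hypothetical labels and predictions (1 = dropout, 0 = no dropout).
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
```

Because every synthetic point lies on a segment between two existing minority samples, the oversampled class stays inside the region the real minority data already occupies. Oversampling of this kind is normally applied to the training split only, so that the reported metrics are computed on unmodified data.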
