Abstract

Predicting students' academic performance is of paramount importance to educational institutions. Predicting it course-wise, semester-wise and year-wise is particularly helpful to students. In this research article, we generate twenty-nine datasets from an analysis of students' results. Datasets 1 to 9 contain students' results at the first attempt, datasets 10 to 19 contain results after students have passed all courses of the semesters, and datasets 20 to 29 are overall datasets of students' result analysis. To predict the course-wise, semester-wise and year-wise performance of students, we developed a framework titled ECSYAPPS (Educational Course Semester Year-wise Academic Performance Prediction System), based on classification techniques, and designed an algorithm for analyzing students' performance in the education sector. ECSYAPPS predicts the course-wise, semester-wise and year-wise grades of students. Fifteen classification algorithms, namely Logistic Regression (LR), Naive Bayes (NB), Decision Tree (DT), Random Forest (RF), Support Vector Machines (SVM), Linear Support Vector Classification (LSVC), K-Nearest Neighbors (KNN), Gradient Boosting (GB), Adaptive Boosting (AdaBoost), Bagging, Extreme Gradient Boosting (XGBoost), Light Gradient-Boosting Machine (LightGBM), Categorical Boosting (CatBoost), Linear Discriminant Analysis (LDA), and Stochastic Gradient Descent Classifier (SGDC), are applied to the 29 datasets and compared on the performance parameters Accuracy, Precision, Recall, F1-score and Mean Absolute Error (MAE). These fifteen classification algorithms fall under ten classifier groups: Linear Model, Naive Bayes, Tree, Support Vector Machine, Nearest Neighbors, Ensemble, xgboost, lightgbm, Discriminant Analysis and catboost.
If two or more classification algorithms achieve the same accuracy on a dataset, Precision, Recall, F1-score and Mean Absolute Error (MAE) are compared to decide the best classification algorithm for that dataset. In this way, the best classification algorithm is selected for each of the 29 datasets. The xgboost classifier is found to work best for eleven datasets, while the ensemble techniques Gradient Boosting and AdaBoost work best for two and six datasets respectively. Decision Tree, LightGBM, LDA and KNN are noted to be the best classification algorithm for two, four, three and two datasets respectively. The framework is tested on eight new datasets of students' results using two methods: K-Fold Cross-validation and the Train-Validation-Test method. The results on the new datasets show that the accuracy obtained on the test or validation dataset differs from the accuracy obtained on the original datasets by less than 6%. The framework will be helpful for students as well as instructors: students can use it to improve their examination performance in the courses they find difficult, while faculty can use it to improve pedagogical practices.
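
The selection rule described above, highest accuracy first with the remaining metrics used to break ties, can be illustrated with a minimal sketch. The abstract does not state the order in which the tie-breaking metrics are consulted, so the order used here (Precision, Recall, F1-score, then lowest MAE) is an assumption, as are the names `Scores` and `select_best` and the example scores, which are invented for illustration and are not results from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    """Evaluation scores of one classification algorithm on one dataset."""
    name: str
    accuracy: float
    precision: float
    recall: float
    f1: float
    mae: float


def select_best(results):
    """Pick the entry with the highest accuracy.

    Ties on accuracy fall through to precision, recall and F1-score
    (higher is better) and finally to MAE (lower is better).
    The tie-breaking order is an assumption, not taken from the article.
    """
    if not results:
        raise ValueError("results must not be empty")
    return max(
        results,
        key=lambda s: (s.accuracy, s.precision, s.recall, s.f1, -s.mae),
    )


# Hypothetical scores, for illustration only.
example = [
    Scores("XGBoost", 0.90, 0.89, 0.90, 0.89, 0.12),
    Scores("AdaBoost", 0.90, 0.91, 0.90, 0.90, 0.11),
    Scores("KNN", 0.85, 0.86, 0.85, 0.85, 0.20),
]

best = select_best(example)
print(best.name)
```

In this made-up example two algorithms tie on accuracy, so the choice is decided by precision. Repeating the call once per dataset would give one selected algorithm per dataset, as the abstract describes.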
