Abstract

In several areas, including education, machine learning models such as artificial neural networks have significantly improved prediction tasks. One problem with their use is the opacity of these models. Decision-makers in education prefer prediction models that offer valuable insights while remaining simple to comprehend. Hence, this study proposes an approach that improves on previous student performance prediction by enhancing predictive performance and explaining why a student attains a certain score. A prediction model was proposed and tested using machine learning models. Our models outperform previous models developed on the same dataset. Using a combined framework of data-level and algorithm-level approaches, the proposed model achieves an accuracy of over 98%, implying a 20.3% improvement over previous models. As a balancing technique for upsampling data, we use the default strategy of the synthetic minority oversampling technique (SMOTE), which oversamples all classes to the number of examples in the majority class. We also use ensemble methods. To tune the parameters, we use the simple grid search algorithm provided by scikit-learn to estimate the optimal parameters of our model. This hyperparameter optimization, along with a ten-fold cross-validation process, demonstrates the dependability of the novel model. In addition, a novel visual and intuitive technique, SHAP values, is used to determine which factors most influence the score; it helps to interpret and understand the entire model and visualizes feature attributions at the observation level. SHAP values are therefore a powerful tool that should be incorporated within the student performance prediction framework. With the predictions and explanations obtained through the experiment, educators can recognize at-risk students early and provide suitable advice in a timely manner.
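The balancing step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which SMOTE implementation was used, so the function below is a simplified, dependency-free stand-in written for illustration only: the name `smote_oversample`, the toy feature rows, and the class labels are all assumptions, not taken from the paper. It raises every non-majority class to the majority-class count by interpolating between a sampled point and one of its nearest same-class neighbours.

```python
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Oversample every non-majority class up to the majority-class count.

    Simplified SMOTE (illustrative, not the paper's code): each synthetic
    point is a random interpolation between a sampled minority point and
    one of its k nearest neighbours from the same class.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the majority class
    X_out = [list(row) for row in X]
    y_out = list(y)

    for label, count in counts.items():
        if count == target:
            continue  # the majority class is left unchanged
        members = [row for row, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            raise ValueError(f"class {label!r} needs at least 2 samples")
        for _ in range(target - count):
            base = rng.choice(members)
            # Rank the other class members by squared Euclidean distance.
            others = sorted(
                (m for m in members if m is not base),
                key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
            )
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # interpolation factor in [0, 1)
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: two features per student, three imbalanced classes.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.2, 0.3],
     [5.0, 5.0], [5.2, 5.1],
     [9.0, 1.0], [9.1, 1.2], [9.3, 0.9]]
y = ["pass"] * 6 + ["fail"] * 2 + ["at_risk"] * 3

X_res, y_res = smote_oversample(X, y, k=2)
balanced = Counter(y_res)
print(balanced)
```

Because synthetic rows are built only from existing rows of the same class, oversampling is normally applied to the training folds alone during cross-validation, so that no synthetic point derived from a held-out sample leaks into training.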

Highlights

  • Data mining techniques play an essential role in many application fields, such as business analytics, security analytics, financial analytics, and learning analytics

  • These results were obtained via the use of strategies and techniques, such as synthetic minority oversampling technique (SMOTE), hyperparameter optimization, and cross-validation process, which demonstrate the dependability of the novel model

  • The SHAP value and the associated visualizations provide a view into the inner operations of the prediction models used and increase the transparency of the model
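To make the idea of observation-level feature attribution concrete, the sketch below computes exact Shapley values by brute force for a single prediction. It is not the SHAP library and not the paper's model: the function `shapley_values`, the linear `toy_score` model, the feature names, and the baseline student are all hypothetical, and "feature absent" is approximated by substituting a baseline value. Practical SHAP tooling relies on faster approximations, since this enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one observation (brute force).

    Features outside a coalition are replaced by their baseline value,
    a common simplification of the "feature absent" notion.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(rest, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(features):
    """Hypothetical score model: study hours, absences, prior grade."""
    hours, absences, prior = features
    return 50 + 4 * hours - 1.5 * absences + 0.3 * prior


student = [6, 2, 70]    # the observation to explain (illustrative values)
reference = [3, 5, 60]  # hypothetical "average student" baseline

phi = shapley_values(toy_score, student, reference)
for name, contribution in zip(["study_hours", "absences", "prior_grade"], phi):
    print(f"{name}: {contribution:+.2f}")
```

The attributions sum to the difference between the student's predicted score and the baseline prediction, which is the additivity property that makes such explanations easy to read. For a linear model like this toy one, each attribution reduces to the coefficient times the feature's deviation from the baseline.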


Summary

Introduction

Data mining techniques play an essential role in many application fields, such as business analytics, security analytics, financial analytics, and learning analytics. We are primarily concerned with applications of data mining in the educational environment. This area of research focuses on the design and application of algorithms on educational datasets to better understand students and their educational system [1]. A primary application of educational data mining (EDM) is investigating the student learning process and predicting student performance to improve educational practices. In this context, we attempt to approximate student performance, experience, ranking, or grade [2] by extracting features from traditionally recorded or logged data. One of the most promising research fields of information technology is

