Optimising Student Performance Prediction Using DTC Enhanced with WOA and ECPO
- Conference Article
4
- 10.1109/tale48000.2019.9225924
- Dec 1, 2019
With the rapid development of e-learning and the transformation of the traditional classroom, novel technology should be applied in the human learning domain, an approach known as smart learning. As deep learning has developed quickly in recent years, a variety of prediction methods have been successfully applied in many domains. Most recent studies on student performance prediction use machine learning methods such as decision trees and k-nearest neighbors to discover the correlation between students' features and their performance. In this work, we explore the application of deep learning to student performance prediction. Inspired by works that combine deep learning with collaborative filtering, we propose a novel method based on neural collaborative filtering for student performance prediction which, unlike other recent work on the topic, does not require external features of students. The main contribution of our work is a novel representation method for latent features in which the latent space is separated into several parts according to their meanings, which allows the model to learn a better latent representation for inference. Our results on real student score prediction show that the proposed methods can outperform existing models.
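The idea of predicting scores from latent student and course features alone can be sketched with plain matrix factorisation. This is a deliberately simpler stand-in for the neural collaborative filtering model the abstract describes, not the authors' method. The function names, the hyperparameters, and the toy scores on a 0 to 1 scale are all assumptions made for illustration.

```python
import random

def train_mf(scores, n_students, n_courses, k=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit latent vectors with SGD on (student, course, score) triples.

    Plain matrix factorisation: score ~ global mean + P[student] . Q[course].
    No external student features are used, only the observed scores.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_students)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_courses)]
    mean = sum(y for _, _, y in scores) / len(scores)
    for _ in range(epochs):
        for s, c, y in scores:
            pred = mean + sum(P[s][f] * Q[c][f] for f in range(k))
            err = y - pred
            for f in range(k):
                p, q = P[s][f], Q[c][f]
                # Gradient step on squared error with L2 regularisation.
                P[s][f] += lr * (err * q - reg * p)
                Q[c][f] += lr * (err * p - reg * q)
    return mean, P, Q

def predict(model, s, c):
    """Predicted score for student s on course c."""
    mean, P, Q = model
    return mean + sum(pf * qf for pf, qf in zip(P[s], Q[c]))

def rmse(model, scores):
    """Root mean squared error of the model on a list of triples."""
    total = sum((y - predict(model, s, c)) ** 2 for s, c, y in scores)
    return (total / len(scores)) ** 0.5

# Hypothetical toy data: students 0 and 2 score high, students 1 and 3 score low.
data = [(0, 0, 0.90), (0, 1, 0.80), (0, 2, 0.85),
        (1, 0, 0.40), (1, 1, 0.30),
        (2, 0, 0.85), (2, 2, 0.90),
        (3, 1, 0.35), (3, 2, 0.40)]
model = train_mf(data, n_students=4, n_courses=3)
print(round(rmse(model, data), 3))
```

A neural variant would replace the dot product with a learned network over the concatenated latent vectors; the partitioned latent space mentioned in the abstract is not modelled here.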
- Research Article
- 10.63544/ijss.v4i1.117
- Mar 28, 2025
- Inverge Journal of Social Sciences
Accurately predicting student performance and identifying anomalies in academic datasets has become increasingly crucial for enhancing educational outcomes and enabling data-driven interventions in modern learning environments. Traditional statistical methods and conventional machine learning approaches often struggle with the multidimensional nature and increasing scale of contemporary student datasets, which encompass diverse academic, behavioral, and socio-demographic variables. To address these challenges, this study explores advanced deep learning techniques, including Autoencoders for unsupervised anomaly detection, Recurrent Neural Networks with Long Short-Term Memory architectures for temporal pattern recognition, and Deep Neural Networks for comprehensive performance prediction. The proposed framework demonstrates significant improvements in detecting subtle performance anomalies that often precede academic difficulties, while simultaneously predicting longitudinal success patterns with greater accuracy than traditional methods. By leveraging the hierarchical feature learning capabilities of deep architectures, our system enables early identification of at-risk students through continuous analysis of complex, nonlinear relationships in educational data, allowing institutions to implement timely, personalized interventions. Research studies have empirically validated the effectiveness of these models in educational contexts, showing superior performance in measuring student achievement patterns and predicting learning outcomes. The findings not only contribute to theoretical advancements in educational analytics but also provide practical insights for curriculum designers and policy makers seeking to optimize instructional strategies. Furthermore, the study establishes significant benchmarks for educational contexts by demonstrating how deep learning can enhance both teaching methodologies and student support systems through data-driven insights.
This research makes a substantial contribution to the growing field of Educational Data Mining by proposing a robust deep learning framework that serves as both a predictive tool and a baseline for future studies in student performance analysis, while also addressing critical challenges in model interpretability and implementation scalability within real-world educational settings.
- Research Article
113
- 10.1016/j.compedu.2020.104108
- Dec 24, 2020
- Computers & Education
Massive LMS log data analysis for the early prediction of course-agnostic student performance
- Research Article
5
- 10.1155/2022/2581951
- Jul 31, 2022
- Mathematical Problems in Engineering
Over the last few decades, there has been a gradual deterioration in higher education in all three areas: the academic setting (both staff and students), as well as research and development output (including graduates). All colleges and universities are essentially focused on improving management decision-making and educating students. High-quality higher education can be obtained through a variety of methods. One method is to accurately forecast students' achievement in their chosen educational context. There are numerous prediction models from which to pick. While it is unclear whether there are any markers that can predict whether a student will be an academic genius, a dropout, or an average performer, the researcher reports student achievement. This article presents a metaheuristics and machine learning-based method for the classification and prediction of student performance. First, features are selected using a relief algorithm. Machine learning classifiers such as BPNN, RF, and NB are then used to classify student academic performance data. BPNN achieves better accuracy for the classification and prediction of student academic performance.
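The feature selection step can be illustrated with a basic Relief variant for two classes. The abstract does not say which Relief variant or settings were used, so this sketch makes its own choices: it visits every instance instead of sampling, uses range-normalised absolute differences, and runs on made-up data in which the first feature is informative and the second is noise.

```python
def relief_weights(X, y):
    """Basic Relief for binary labels.

    For each instance, find its nearest neighbour of the same class (hit) and of
    the other class (miss). A feature gains weight when it differs on the miss
    and loses weight when it differs on the hit.
    """
    n, d = len(X), len(X[0])
    # Range of each feature, used to normalise differences into [0, 1].
    spans = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights

# Hypothetical data: column 0 behaves like an informative feature, column 1 like noise.
X = [[0.90, 0.2], [0.80, 0.7], [0.85, 0.5],
     [0.30, 0.6], [0.20, 0.1], [0.25, 0.8]]
y = [1, 1, 1, 0, 0, 0]
print(relief_weights(X, y))
```

Features with the highest weights would be kept and passed to the downstream classifier.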
- Conference Article
2
- 10.1109/icitacee55701.2022.9923971
- Aug 25, 2022
High dropout rates and low student performance are persistent issues for educational institutions in many countries. Consequently, this study presents an automated technique to predict student performance and graduation from student data, using both separated and combined prediction methods. The data was collected from an Indonesian university. Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), which are strong models for handling sequence data, were used in this study. According to our study, both LSTM and GRU perform well, scoring above 90% in each prediction task. Each architecture outperformed the other depending on the task. In early prediction, student graduation prediction gives satisfactory performance from the first semester onward, despite a tradeoff in recall, whereas in student performance prediction the RMSE value was acceptable from the second semester onward. Overall, student performance and graduation prediction performed better with the separated method than with the combined method. This pipeline can be replicated and improved for similar tasks in other universities, with feature adjustments based on data availability.
- Research Article
- 10.35377/saucis...1635558
- Mar 28, 2025
- Sakarya University Journal of Computer and Information Sciences
Early prediction of student performance is a critical and challenging task in the field of Educational Data Mining (EDM), encompassing all levels of education. Although there is extensive literature on student performance within EDM, studies specifically focused on early prediction are limited and mostly rely on traditional machine learning methods. However, in recent years, the importance and use of deep learning (DL) methods have increased due to their ability to process large datasets. This systematic literature review focuses on the early prediction of student performance using DL techniques. A total of 39 articles selected from the Scopus and Web of Science databases were analyzed using systematic and bibliometric methods. The review addresses five key research questions, including the distribution of studies by publication year, type, and education level; the datasets and features used; DL models and techniques; the timing of early predictions; and the challenges, limitations, and opportunities encountered. The bibliometric analysis, conducted with the VOSviewer program, visualized relationships between keywords, authors, and articles. Overall, this review provides a comprehensive synthesis of existing research on the early prediction of student academic performance using DL, offering valuable insights into trends and opportunities for researchers, educators, and policymakers.
- Research Article
- 10.1038/s41598-025-16311-y
- Aug 20, 2025
- Scientific Reports
Student performance prediction (SPP) constitutes one of the pivotal tasks in educational data analysis. Outcomes from the prediction enable educators to implement targeted interventions for students. Therefore, developing an effective SPP model is of critical importance. The belief rule base (BRB) is a rule-based modeling approach that integrates expert knowledge and effectively manages uncertain information. Nevertheless, when employing a traditional BRB to construct a prediction model, excessive input attributes and reference points may result in a combinatorial explosion. Furthermore, in practical scenarios, the configuration of the model's parameters may be restricted by the limitations of expert knowledge. To overcome these challenges, an SPP model using an interval BRB structure based on the random forest (RF) attribute selection method (IBRB-C) is proposed. The parameters of the IBRB-C model are determined by combining expert knowledge and the K-means++ algorithm. Subsequently, the P-CMA-ES algorithm is applied to optimize the initial model. An ablation experiment is conducted to validate the rationality of the IBRB-C. Finally, case studies on graduate applications and student GPA demonstrate that the mean squared error (MSE) of the IBRB-C is 0.0024 and 0.1014, respectively. The results of comparative experiments confirm the superiority of the IBRB-C model in predicting student performance.
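The mean squared error reported above is the average of the squared differences between true and predicted values. A minimal sketch, with a hypothetical function name and made-up GPA values that are not taken from the paper:

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical GPA values, for illustration only.
true_gpa = [3.0, 3.5, 2.0]
predicted_gpa = [3.0, 3.0, 2.5]
print(mean_squared_error(true_gpa, predicted_gpa))
```

Because MSE depends on the scale of the target, the two values in the abstract (0.0024 and 0.1014) are not directly comparable unless both targets use the same scale.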
- Research Article
15
- 10.1080/08839514.2018.1508839
- Sep 25, 2018
- Applied Artificial Intelligence
Problem: Online higher education (OHE) failure rates reach 40% worldwide. Prediction of student performance at early stages of the course calendar has been proposed as a strategy to prevent student failure. Objective: To investigate the application of genetic programming (GP) to predict the final grades (FGs) of online students using grades from an early stage of the course as the independent variable. Method: Data were obtained from the learning management system; we performed statistical analyses with FGs as the dependent variable and 11 independent variables; two statistical models and one GP model were generated; the prediction accuracies of the models were compared by means of a statistical test. Results: The GP model was better than the statistical models, with confidence levels of 90% and 99% for the training and testing data sets, respectively. These results suggest that GP could be implemented to support the decision-making process in OHE for early student failure prediction.
- Research Article
1
- 10.1002/cpe.7102
- May 30, 2022
- Concurrency and Computation: Practice and Experience
Student performance in the academic field has drawn researchers' attention to addressing students' weaknesses. Accurate student performance prediction relies on using the high-potential factors in the dataset. A targeted projection pursuit similarity based attribute selection (TPPS‐AS) technique is designed to improve student academic performance prediction. TPPS is a machine learning technique that observes the given input and determines the relevant attributes from the multidimensional space. Several recent research works have sought to identify the high-potential factors for observing student academic performance. In this work, a novel technique is designed to improve student academic performance prediction on a given dataset by choosing more relevant attributes. The proposed TPPS‐AS technique is employed to predict student academic performance with better accuracy and in less time. With this technique, the prediction accuracy of student academic performance increases after the relevant attributes are selected.
- Research Article
79
- 10.1108/jarhe-09-2017-0113
- Dec 21, 2017
- Journal of Applied Research in Higher Education
Purpose: The purpose of this paper is to empirically investigate and compare the use of multiple data sources, different classifiers and ensembles of classifiers in predicting student academic performance. The study compares the performance and efficiency of ensemble techniques that make use of different combinations of data sources with that of base classifiers with a single data source. Design/methodology/approach: Using a quantitative research methodology, data samples of 141 learners enrolled at the University of the West of Scotland were extracted from the institution’s databases and also collected through a survey questionnaire. The research focused on three data sources (student record system, learning management system and survey) and used three state-of-the-art data mining classifiers, namely decision tree, artificial neural network and support vector machine, for the modeling. In addition, ensembles of these base classifiers were used in the student performance prediction, and the performances of the seven different models developed were compared using six different evaluation metrics. Findings: The results show that using multiple data sources along with heterogeneous ensemble techniques is very efficient and accurate in predicting student performance and helps in properly identifying students at risk of attrition. Practical implications: The approach proposed in this study will help educational administrators and policy makers working within the educational sector to develop new policies and curricula on higher education that are relevant to student retention. In addition, the general implication of this research for practice is its ability to help identify students at risk of dropping out of HE early and accurately from the combination of data sources, so that necessary support and intervention can be provided. Originality/value: The research empirically investigated and compared the accuracy and efficiency of single classifiers and ensembles of classifiers that make use of single and multiple data sources. The study has developed a novel hybrid model for predicting student performance that is high in accuracy and efficient in performance. Generally, this research advances the understanding of the application of ensemble techniques to predicting student performance using learner data and has addressed these fundamental questions: What combination of variables will accurately predict student academic performance? What is the potential of stacking ensemble techniques for accurately predicting student academic performance?
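A heterogeneous ensemble combines the outputs of different base classifiers. The abstract names stacking; the sketch below uses majority voting instead, a simpler combination rule chosen here only to show the idea. The three prediction lists are made up and stand in for the outputs of a decision tree, a neural network and a support vector machine.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote.

    predictions: list of lists, one inner list per model, all the same length.
    """
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest count for this student.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of three base classifiers for three students.
tree_pred = ["pass", "fail", "pass"]
ann_pred = ["pass", "pass", "fail"]
svm_pred = ["fail", "fail", "fail"]
print(majority_vote([tree_pred, ann_pred, svm_pred]))  # ['pass', 'fail', 'fail']
```

Stacking differs in that a second-level model is trained on the base classifiers' outputs rather than counting votes.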
- Research Article
78
- 10.1109/access.2020.3036572
- Jan 1, 2020
- IEEE Access
Understanding, modeling, and predicting student performance in higher education poses significant challenges concerning the design of accurate and robust diagnostic models. While numerous studies attempted to develop intelligent classifiers for anticipating student achievement, they overlooked the importance of identifying the key factors that lead to the achieved performance. Such identification is essential to empower program leaders to recognize the strengths and weaknesses of their academic programs, and thereby take the necessary corrective interventions to ameliorate student achievements. To this end, our paper contributes, firstly, a hybrid regression model that optimizes the prediction accuracy of student academic performance, measured as future grades in different courses, and, secondly, an optimized multi-label classifier that predicts the qualitative values for the influence of various factors associated with the obtained student performance. The prediction of student performance is produced by combining three dynamically weighted techniques, namely collaborative filtering, fuzzy set rules, and Lasso linear regression. However, the multi-label prediction of the influential factors is generated using an optimized self-organizing map. We empirically investigate and demonstrate the effectiveness of our entire approach on seven publicly available and varying datasets. The experimental results show considerable improvements compared to single baseline models (e.g. linear regression, matrix factorization), demonstrating the practicality of the proposed approach in pinpointing multiple factors impacting student performance. As future works, this research emphasizes the need to predict the student attainment of learning outcomes.
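Combining several predictors with dynamic weights can be sketched as a weighted average whose weights are recomputed from each predictor's recent error. The abstract does not state how the weights are derived, so the inverse-error rule below is an assumption, as are the function names and the example numbers.

```python
def inverse_error_weights(errors, eps=1e-9):
    """Turn per-model errors into weights that sum to 1; lower error, higher weight."""
    inverse = [1.0 / (e + eps) for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]

def blend(predictions, weights):
    """Weighted average of the individual model predictions."""
    return sum(p * w for p, w in zip(predictions, weights))

# Hypothetical validation errors for collaborative filtering, fuzzy rules and Lasso.
weights = inverse_error_weights([1.0, 2.0, 4.0])
# Hypothetical grade predictions from the same three models for one student.
print(round(blend([80.0, 70.0, 60.0], weights), 2))
```

Recomputing the weights per course or per time window is what would make the combination dynamic rather than fixed.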
- Research Article
128
- 10.1007/s10639-020-10230-3
- Jul 1, 2020
- Education and Information Technologies
Student performance modelling is one of the challenging and popular research topics in educational data mining (EDM). Multiple factors influence performance in non-linear ways, which makes this field more attractive to researchers. The widespread availability of educational datasets further catalyses this interest, especially in online learning. Although several EDM surveys are available in the literature, we could find only a few specific surveys on student performance analysis and prediction. These specific surveys are limited in nature and primarily focus on studies that try to identify possible predictors or model student performance. However, the previous works do not address the temporal aspect of prediction. Moreover, we could not find any specific survey that focuses only on classroom-based education. In this paper, we present a systematic review of EDM studies on student performance in classroom learning. It focuses on identifying the predictors, the methods used for such identification, and the time and aim of prediction. It is, significantly, the first systematic survey of EDM studies that considers only classroom learning and focuses on the temporal aspect as well. This paper presents a review of 140 studies in this area. The meta-analysis indicates that researchers achieve significant prediction efficiency during the tenure of the course. However, performance prediction before course commencement needs special attention.
- Book Chapter
26
- 10.1007/978-3-319-23781-7_21
- Jan 1, 2015
Students' performance prediction in distance higher education has been widely researched over the past decades. Machine learning techniques, and especially supervised learning, have been used in numerous studies to identify in time students who are likely to fail final exams. Identifying potential failure as early as possible could lead the academic staff to develop learning strategies aimed at improving students' overall performance. In this paper, we investigate the effectiveness of semi-supervised techniques in predicting students' performance in distance higher education. Our research includes several experiments comparing the accuracy measures of familiar semi-supervised algorithms. As far as we are aware, various studies deal with students' performance prediction in distance learning by using machine learning techniques, especially supervised methods, but none of them investigates the effectiveness of semi-supervised algorithms. Our results confirm the advantage of semi-supervised methods and especially the satisfactory performance of the Tri-Training algorithm.
- Research Article
2
- 10.22399/ijcesen.1524
- Apr 9, 2025
- International Journal of Computational and Experimental Science and Engineering
Education is a pillar of any individual's success in life. Knowledge development and learning make everyone an educated person. Many universities and colleges offer graduate courses of study in various disciplines, and students choose courses based on interest. At the same time, many studies consider ordinary factors, such as personal and academic features, and have experimented with many machine learning models to evaluate students' performance, which resulted in low accuracy, and many algorithms are not able to manage imbalanced datasets. This research utilized ML algorithms, EDA analysis and hybrid algorithms for students' performance prediction. Exploratory data analysis was performed to identify the correlation between features and to find the features that support the evaluation of students' performance prediction. Based on the evidence from the EDA analysis, this paper aims to provide a deep learning-based hybrid approach, consisting of Deep Neural Network-Random Forest (DNN-RF) and Deep Neural Network-Light GBM (DNN-Light GBM) algorithms, that is capable of handling a wide range of datasets from small to enormous and improves prediction accuracy. The results show that DNN-RF achieved an accuracy of 99.56%, precision of 97.82%, recall of 98.13% and F1 score of 98.95%, while DNN-Light GBM attained 90.76%, 85.13%, 84.94% and 87.93%, respectively. Compared with the ML algorithms RF and Light GBM and with DNN-Light GBM, DNN-RF is the most effective algorithm for forecasting student performance.
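The four metrics reported above can be computed from the counts of a binary confusion matrix. The sketch below shows the standard definitions for a single positive class. The counts are made up and do not come from the paper, which also does not say whether its figures are per-class or averaged over classes.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, precision, recall, f1

# Hypothetical counts: 45 true positives, 5 false positives,
# 10 false negatives, 40 true negatives.
acc, prec, rec, f1 = classification_metrics(tp=45, fp=5, fn=10, tn=40)
print(round(acc, 3), round(prec, 3), round(rec, 3), round(f1, 3))
```

On imbalanced datasets such as those the abstract mentions, precision, recall and F1 are usually more informative than accuracy alone.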
- Research Article
- 10.28945/5456
- Jan 1, 2025
- Journal of Information Technology Education: Research
Aim/Purpose: The purpose of this study is to review and categorize current trends in student engagement and performance prediction using machine learning techniques during online learning in higher education. The goal is to gain a better understanding of student engagement prediction research that is important for current educational planning and development. However, implementing machine learning approaches in student engagement studies is still very limited. Background: The rise of online learning during and after COVID-19 has created new difficulties for students’ engagement and academic achievements. Lecturers’ manual monitoring and supporting of students are inadequate online, leading to disengagement and performance challenges that may be very difficult to notice. Machine learning has great potential to provide an accurate prognosis of students’ engagement and outcomes to make early interventions possible. Nevertheless, the current knowledge deficit is in the systematic presentation of trends and insights concerning the utilization of these approaches in higher education online learning, especially with a focus on student engagement research. This research fills a crucial void by explaining and analyzing current trends in machine learning-based prediction models to enhance the quality and efficiency of an online learning system. Methodology: This research examines the existing literature on the application of machine learning, which allows computers to learn from data and improve their performance for early identification of student engagement and academic performance in higher education during online learning. The PICOC protocol was implemented to guide the search process and define the relevant keywords aligned with the research questions. Based on the PRISMA framework, a structured approach is adopted to identify and select studies to screen and extract the relevant papers from the database. 
Meta-analysis was adopted in data analysis whereby studies are combined and evaluated to provide insights into machine learning techniques’ effectiveness in student engagement and academic performance research. Contribution: This paper aims to present the current trends in predicting student engagement and academic achievement by applying machine learning approaches with a focus on their relevance in the context of online learning. It defines challenges that emerge with an interpretation of the extent of student engagement, which include the absence of consensus on levels of student engagement that hampers the use of explainable artificial intelligence – approaches that make training of machine learning models more logical, understandable and easily interpretable by lecturers. The finding points to the fact that through the prediction models, lecturers are enabled to recognize disengaged students early and foster their needs towards learning, providing direction toward more customized and effective online learning. Findings: A total of 96 primary studies have been identified and included in this systematic review. It is important to highlight the relevance of classification machine learning methods that are implemented in 88.60% of papers, while clustering methods are only employed in 15.19% of studies. Furthermore, the review shows that most research focuses on student performance prediction (82.28%) compared to student engagement level prediction (12.66%). Besides, student engagement datasets are used in 92.14% of studies, emphasizing student engagement’s popularity in educational prediction research. Moreover, classification machine learning methods are more prevalent in educational prediction research. In contrast, classification methods for student engagement research are still limited due to challenges in constructing consistent engagement levels. 
Recommendations for Practitioners: Lecturers need to occasionally assess student engagement levels during online learning to identify students who are left out and take immediate planning and action to encourage the student to engage during online learning. The syllabus designer should observe the students’ engagement level during online learning to plan the course content that can attract and engage the students. Students’ engagement during online learning can ensure their academic success and prevent them from dropping out. Recommendation for Researchers: Researchers should focus on defining the consensus on differentiating student engagement levels and implementing more explainable AI to enhance the interpretability and transparency of student engagement level predictive models. Researchers should enhance educational predictive models’ explainability, transparency, and accuracy by addressing issues brought about by feature selection, resampling techniques, and hyperparameter tuning. Impact on Society: The study highlights the growing importance of understanding student engagement through digital footprints, which can support personalized learning experiences and provide better educational outcomes. The efficient predictive models on student engagement can improve the effectiveness of higher education systems, benefiting students and institutions. Future Research: The challenges of current computational methods need to be overcome, including the need for more consistent approaches in differentiating engagement levels and enhancing the explainability and accuracy of educational predictive models through better feature selection, resampling techniques, and hyperparameter tuning.