Articles published on Educational Data Mining
1337 Search results
- Research Article
- 10.1142/s0218213026500132
- Apr 22, 2026
- International Journal on Artificial Intelligence Tools
- Ji Hongzheng
Predicting student academic performance has become increasingly vital in the field of educational data mining, as institutions seek data-driven strategies to enhance learning outcomes. However, many existing models rely solely on behavioral indicators or static features, often overlooking the role of time and context in shaping learning behavior. This limitation reduces predictive accuracy and adaptability in academic environments. To address this challenge, this study introduces EduFuseNet, a hybrid deep learning framework that integrates behavioral and spatiotemporal data for accurate classification of student performance. The workflow begins with data collection from a Student Academic Performance dataset, comprising both behavioral metrics and spatiotemporal information. The raw data undergoes preprocessing, including missing value imputation, one-hot encoding of categorical variables, and min-max scaling of numerical features. The processed data is then passed through two specialized branches: a Tabular Neural Structure-Aware (TabNSA) module that captures complex interdependencies within behavioral data, and a Spatiotemporal Transformer module that models temporal and sequential patterns in learning activities. The feature embeddings from both branches are fused and passed through fully connected layers to generate predictions across five academic performance bands, enabling precise classification and early risk identification. EduFuseNet achieved an accuracy of 99.00%, with a precision of 99.04%, recall of 99.00%, and F1-score of 99.01%, reflecting strong and reliable predictive performance. By leveraging both behavioral and temporal learning indicators, the model serves as an effective tool for early academic monitoring and intervention.
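The preprocessing steps named in this abstract (missing value imputation, one-hot encoding, min-max scaling) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: mean imputation is an assumed choice, and the column names and values (`study_hours`, `location`) are hypothetical.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Encode a categorical column as 0/1 indicator vectors, categories in sorted order."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]

# Hypothetical toy columns, not taken from the paper's dataset.
study_hours = [2.0, None, 6.0, 10.0]
location = ["library", "home", "library", "cafe"]

scaled_hours = min_max_scale(impute_mean(study_hours))
categories, encoded_location = one_hot(location)
```

Imputing before scaling keeps the filled value inside the observed range, so the scaled column still spans exactly [0, 1].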
- Research Article
- 10.62762/tedm.2026.988161
- Apr 11, 2026
- ICCK Transactions on Educational Data Mining
- Farshid Keivanian + 1 more
Educational Data Mining (EDM) has achieved substantial gains in predictive performance, yet many existing approaches remain centered on single-objective optimization, most often accuracy. This does not adequately reflect the multi-dimensional nature of real-world educational decision-making, which requires balancing interpretability, fairness, robustness, efficiency, and timeliness. This perspective advocates a shift toward multi-objective, interpretable, and trustworthy EDM frameworks. We highlight the role of multi-objective optimization in modeling trade-offs through Pareto-optimal solutions and address the challenge of actionable decision-making through bargaining-based mechanisms, such as Nash bargaining, to select balanced and transparent outcomes. In addition, we discuss the value of fuzzy logic and adaptive methods for handling uncertainty and supporting interpretable reasoning in dynamic learning environments. Finally, we emphasize the importance of governance, accountability, and rigorous evaluation, and argue that emerging technologies should be assessed not only by performance gains but also by their practical and educational relevance. Overall, this perspective outlines a human-centered research agenda for the development of trustworthy, interpretable, and context-aware EDM systems.
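The idea of choosing one balanced model from a set of Pareto-optimal candidates via Nash bargaining can be sketched as follows. This is an illustrative sketch under stated assumptions, not a procedure from the article: the candidate models, their (accuracy, interpretability) scores, and the disagreement point are all hypothetical, and both objectives are treated as higher-is-better.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores) for other in candidates.values())
    }

def nash_bargaining_choice(front, disagreement):
    """Pick the candidate maximizing the product of gains over the disagreement point."""
    def nash_product(scores):
        gains = [s - d for s, d in zip(scores, disagreement)]
        if any(g <= 0 for g in gains):
            return float("-inf")  # no better than the fallback on some objective
        product = 1.0
        for g in gains:
            product *= g
        return product
    return max(front, key=lambda name: nash_product(front[name]))

# Hypothetical (accuracy, interpretability) scores for four candidate models.
candidates = {
    "deep_net": (0.92, 0.30),
    "random_forest": (0.88, 0.60),
    "decision_tree": (0.80, 0.90),
    "weak_tree": (0.75, 0.85),
}
disagreement = (0.70, 0.20)  # hypothetical minimum acceptable level per objective

front = pareto_front(candidates)
choice = nash_bargaining_choice(front, disagreement)
```

With these toy numbers, `weak_tree` is dominated by `decision_tree` and drops out of the front; among the three remaining models, the Nash product favors the one whose gains are most balanced across both objectives rather than the one that is best on a single objective.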
- Research Article
- 10.70593/deepsci.0202045
- Apr 5, 2026
- International Journal of Applied Resilience and Sustainability
- Taibat Bolarinwa
The adoption of Artificial Intelligence (AI) in education has created both great potential and problems related to academic progress, critical thinking, mental abilities, and final student results. The increased application of generative AI, intelligent tutoring systems, adaptive learning systems, predictive analytics, and AI-assisted learning tools has altered the traditional learning environment, although concerns have been raised about over-reliance on technology, lower-level thinking, algorithmic bias, and academic dishonesty. This literature review systematically investigated recent articles on Artificial Intelligence in Education (AIEd) and its effects on student engagement, academic achievement, cognitive growth, and learning customization. The focus was on emerging trends such as ChatGPT in education, machine learning in education, educational data mining, personalized feedback, and smart classrooms. The review established that AI-assisted learning and individualized instructional settings have a positive effect on knowledge retention, student motivation, self-regulated learning, digital literacy, and problem-solving abilities. Learning analytics and adaptive learning systems enhance academic outcomes by providing personalized learning pathways and real-time feedback. The results also suggest that over-dependence on AI tools can negatively affect critical thinking, creativity, metacognition, and independent reasoning if AI is not carefully incorporated into pedagogy. Growing concerns about algorithmic bias, cognitive load, ethical issues, and the impact of large language models on academic integrity were also identified as major issues in the review.
- Research Article
- 10.3390/data11040075
- Apr 3, 2026
- Data
- Erika María López-López + 2 more
Student attrition remains a persistent challenge in higher education and is shaped by interacting socioeconomic, academic, institutional, and wellbeing-related mechanisms. Although learning analytics and educational data mining increasingly support early-warning and intervention workflows, dataset reuse is often limited by incomplete documentation and inconsistent variable definitions. This Data Descriptor presents a structured cross-sectional survey dataset on factors influencing student persistence at a Colombian public university campus (La Paz). Data were collected between August and December 2025 through an online questionnaire and subsequently cleaned to remove duplicate entries and personally identifiable information. The released dataset contains 333 student records and 33 variables covering demographics (e.g., age, gender, first-generation status), socioeconomic conditions (e.g., residential stratum, housing, financial aid), academic experience and satisfaction (multiple 1–5 Likert items), perceived dropout intention across personal/socioeconomic/academic domains, thematically coded open-ended items describing challenges and motives, and a self-allocation of 0–100 weights across three dropout-factor domains. We provide a machine-readable codebook, a transparent preprocessing description, and technical validation checks (value ranges, category consistency, and composite-score integrity). The dataset is intended to support reproducible retention research, equity-oriented analyses, and benchmarking of predictive models, while encouraging responsible reuse through privacy-preserving release practices and FAIR-aligned metadata, repository deposition, and versioning.
- Research Article
- 10.37134/jsml.vol14.2.2.2026
- Apr 1, 2026
- Journal of Science and Mathematics Letters
- Nurulhuda Ramli
Learning Management Systems (LMS) have become integral tools in higher education, generating vast amounts of data that can be leveraged to analyze and enhance academic performance. Despite the abundance of this data, effectively harnessing it to understand complex relationships between learning activities and student outcomes remains a challenge. This paper explores the application of the Bayesian Network (BN), a powerful technique in Educational Data Mining (EDM), to model and predict student outcomes using LMS data. A BN provides a probabilistic framework to explore how various learning analytics variables influence academic success. Using LMS data from an online undergraduate Mathematics course, the model investigated the impact of student engagement, resource utilization, and participation on exam grades. The results show that consistent attendance (88%), active participation in lecturing sessions (85%), and involvement in online mathematical laboratory activities (62%), despite lower engagement in other areas such as assessments and gamification, are strongly associated with favourable final exam outcomes (62% achieving ‘Good’ or ‘Excellent’ grades). Numerical simulations were conducted to explore future student outcomes by manipulating key variables, demonstrating the potential of improved learning strategies such as full participation, improved prior knowledge, and complete utilization of digital resources. This study highlights the utility of BN in analyzing LMS data to inform educational practices and ultimately enhance academic performance in higher education.
- Research Article
- 10.22214/ijraset.2026.78840
- Mar 31, 2026
- International Journal for Research in Applied Science and Engineering Technology
- Dr P C Khanzode
Student academic performance prediction is a crucial topic in Educational Data Mining (EDM) and Learning Analytics, and can help at-risk students by enabling timely interventions. This paper is a systematic review of machine learning methods used in this field. It analyzes a range of approaches, from interpretable models such as Multiple Linear Regression and Decision Trees to high-performance ensemble and deep learning models such as Random Forest and Neural Networks. The review highlights the central role of feature engineering and discusses predictors drawn from academic and behavioral data as well as socioeconomic and psychological conditions. One of the paper's main contributions is a comparative analysis of these methods, with an emphasis on the continuing trade-off between predictive accuracy and model interpretability. The review also critically discusses the disconnect between theory and real-world, full-stack deployment systems, which are increasingly critical for actual usability. It identifies major research gaps, such as excessive use of synthetic data, lack of practical testing, and ethical issues. Lastly, the paper presents future directions, including the use of Explainable AI (XAI), federated learning for privacy, and the creation of real-time adaptive feedback systems.
- Research Article
- 10.3390/informatics13040050
- Mar 27, 2026
- Informatics
- Yuri Reina Marín + 6 more
Student retention has become a major challenge for higher education institutions due to the influence that academic, socioeconomic, family, and motivational factors exert on students’ academic continuity. In this context, understanding the determinants that explain university persistence is essential for designing effective retention strategies. Based on the analysis of factors related to motivation, commitment, attitude, academic integration, and social and economic conditions, retention patterns were examined in a population of 532 university students, of whom 57.7% showed high retention, 38.2% medium retention, and 4.1% low retention. To identify the factors with the greatest influence on academic continuity, educational data mining techniques and supervised classification models were applied and evaluated using stratified 10-fold cross-validation. Tree-based ensemble models showed the most consistent predictive performance, with Random Forest achieving the best results (accuracy = 0.729 ± 0.058; F1-macro = 0.636 ± 0.136). Model interpretability was examined through SHAP analysis, which revealed that transportation conditions (0.249), task completion (0.170), absence of work obligations (0.168), and course completion (0.164) were the most influential predictors in the classification of retention levels. In addition, sensitivity analysis indicated that academic commitment accounts for 41.6% of the predictive impact, followed by motivation (23.5%). These findings demonstrate that student retention is shaped by the interaction of academic, motivational, and contextual factors and provide practical implications for the development of early warning systems, personalized tutoring programs, psychosocial support initiatives, and financial assistance policies aimed at strengthening university retention.
- Research Article
- 10.1038/s41598-026-40502-w
- Mar 26, 2026
- Scientific Reports
- Yongkang Duan + 1 more
Accurately predicting student dropout in Massive Open Online Courses (MOOCs) remains a critical challenge in educational data mining. While Spatio-Temporal Graph Neural Networks (STGNNs) have shown promise, established frameworks typically rely on first-order temporal dependencies, recursively deriving the current state solely from its immediate predecessor. We argue that such recursive compression fails to capture complex student behaviors, which are driven by the interplay between immediate short-term shocks and accumulated long-term patterns. To address this, we propose the Multi-Scale Spatio-Temporal Graph Network (MST-GCN). The core of our framework is a novel MST-RGCN layer featuring a Spatially-Conditioned Adaptive Gate. This mechanism dynamically modulates the fusion of short-term and long-term memories by explicitly conditioning on the evolving heterogeneous graph context. Comprehensive experiments on two large-scale benchmarks, KDD Cup 2015 and XuetangX, demonstrate that MST-GCN yields superior predictive performance compared to established baselines. Notably, our model exhibits remarkable robustness in unstructured, self-paced learning environments. Furthermore, qualitative analysis reveals that the model learns an interpretable policy: prioritizing long-term history to identify at-risk students while leveraging short-term momentum to predict successful learners. Our source code is publicly available at https://github.com/wudongze9/MST-GCN.
- Research Article
- 10.55041/isjem05836
- Mar 24, 2026
- International Scientific Journal of Engineering & Management
- Dr Satyam K + 1 more
Gathering and evaluating student input is essential to raising academic achievement and teaching quality in contemporary educational institutions. However, conventional feedback systems are frequently labour-intensive, manual, and incapable of drawing significant conclusions from massive amounts of data. This research proposes an intelligent academic feedback analysis system that combines ensemble machine learning and deep learning methods to address these issues. Students' textual feedback is processed by the suggested system, which also preprocesses the data and uses feature extraction techniques to transform unstructured data into a format that can be analysed. To improve forecast accuracy and robustness, ensemble techniques are used with deep learning models, such as neural networks. Institutions can make data-driven decisions thanks to the system's ability to automatically classify input into several categories and spot sentiment patterns. According to experimental findings, the hybrid model performs more accurately and efficiently than conventional machine learning techniques. In the end, this method improves educational results by lowering human labour and offering a scalable solution for real-time academic feedback evaluation. Keywords: Academic Feedback Analysis, Deep Learning, Ensemble Learning, Natural Language Processing, Sentiment Analysis, Educational Data Mining, Text Classification, Machine Learning
- Research Article
- 10.66104/374y8m47
- Mar 24, 2026
- Journal International Review of Research Studies
- Joelson Lopes Da Paixão
The incorporation of artificial intelligence (AI) into educational systems has expanded the possibilities for monitoring learning, producing feedback, and personalizing instruction. In school assessment, however, the adoption of these technologies requires more precise conceptual distinctions between different AI paradigms and a critical analysis of their pedagogical and ethical effects. This study aims to analyze the impacts of AI on school assessment processes by examining its formative potential, epistemological limits, and the ethical challenges involved in its use. Methodologically, this is a qualitative bibliographic study with an analytical and interpretive orientation. Rather than presenting itself as an exhaustive systematic review, the study explicitly adopts the format of an analytical bibliographic review, organized around academic literature and institutional documents relevant to the topic. The analysis is guided by the articulation between formative assessment theory and a critical sociotechnical reading of educational datafication. The study shows that the effects of AI on assessment are not homogeneous: rule-based systems tend to operate better in structured tasks; models supported by learning analytics and educational data mining expand monitoring and diagnostic capacity; and generative systems open new possibilities for open-ended tasks, but still show instability, opacity, and a persistent need for human oversight. The article concludes that AI can contribute to more continuous, responsive, and formative assessment practices, provided that its use remains subordinated to teachers' pedagogical judgment, data protection, algorithmic transparency, and principles of equity.
- Research Article
- 10.54254/2755-2721/2026.as32265
- Mar 16, 2026
- Applied and Computational Engineering
- Yaoyang Huang
The availability of learning activity data from online education platforms has created new opportunities to examine how student behaviors relate to academic outcomes. Within this context, educational data mining has been widely applied to analyze learning patterns and support performance prediction. This paper explores whether students' learning behaviors can be used to predict exam success in a Python learning environment. Exploratory data analysis is used to compare behavioral characteristics between students who passed and those who failed the exam. The prediction task is formulated as a binary classification problem using the variable passed exam. A support vector machine (SVM) classifier is applied to distinguish between pass and fail outcomes, and feature importance analysis is conducted to better understand the contribution of different learning behaviors. The results suggest that engagement-related variables, particularly study time and practice activities, are closely associated with exam success, while demographic features contribute relatively little to prediction performance. These findings are consistent with existing educational data mining research and demonstrate the value of machine learning methods for analyzing learning behavior data.
- Research Article
- 10.54097/08aaz166
- Mar 15, 2026
- Mathematical Modeling and Algorithm Application
- Chuyi Qu
The application of the Internet in education and teaching is increasingly widespread, and this process generates massive amounts of educational data. How to make reasonable use of these data has always been an important issue in the field of educational data mining. A student's Grade Point Average (GPA) is crucial for evaluating the student's own development, helping teachers plan teaching, and enabling schools to formulate education programs. Although there are many precedents for using machine learning to predict students' GPA, the fitting process is often relatively simple. This study adopts a method that combines Stacking and Optuna: the hyperparameters of the base models are tuned with Optuna to enhance their fitting capabilities, and the tuned models are then combined through Stacking. By leveraging the advantages of both, the method reaches a coefficient of determination (R²) of 0.88. This approach is more accurate than previous model construction methods, demonstrating the potential and prospects of this method in the field of regression fitting.
- Research Article
- 10.1080/03610918.2026.2644594
- Mar 14, 2026
- Communications in Statistics - Simulation and Computation
- Xi Chen + 2 more
The importance of educating the next generation in the basics of future technological and scientific innovations, which will drive extensive economic and social change, cannot be overstated. Artificial intelligence (AI), and machine learning (ML) approaches in particular, is considered one such advanced technology. Even though learning analytics and educational data mining have seen a rise in use and exploration, they remain difficult to describe accurately. The application of deep learning (DL) techniques for classifying student grades has been well received by researchers. The key intention of this work is therefore to propose a new hybrid DL approach to classify student grades in the Spark framework. First, the data is split by employing deep embedded clustering with data augmentation (DEC-DA), and the DEC-DA model is trained by the Tyrannosaurus optimization algorithm (TOA). Subsequently, the data is passed to the pre-processing phase, which consists of data cleaning and data scaling. Spearman’s rank correlation coefficient is employed for feature selection. Finally, student grade classification is accomplished by ResNext–Xception, which is an integration of the ResNext and Xception models. Experimental analysis of the proposed model considers performance metrics such as accuracy, sensitivity, and specificity, for which the presented approach reached maximum values of 0.959, 0.965, and 0.949, respectively.
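Feature selection with Spearman's rank correlation coefficient, as named in this abstract, can be sketched in plain Python: rank each column (averaging ranks for ties), then take the Pearson correlation of the ranks. The feature names, toy values, and the 0.5 selection threshold are hypothetical and are not taken from the paper.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def select_features(features, target, threshold=0.5):
    """Keep features whose absolute rank correlation with the target meets the threshold."""
    return [name for name, col in features.items()
            if abs(spearman(col, target)) >= threshold]

# Hypothetical toy data.
grades = [1, 2, 3, 4, 5]
features = {
    "attendance": [60, 70, 65, 90, 95],
    "shoe_size": [42, 38, 44, 39, 41],
}
selected = select_features(features, grades)
```

Because it works on ranks, this criterion picks up any monotonic relationship with the target, not only a linear one.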
- Research Article
- 10.1126/sciadv.adz2240
- Mar 13, 2026
- Science Advances
- Claire L Walsh + 22 more
We present the Human Organ Atlas (HOA), an open data repository making accessible multiscale three-dimensional imaging of human organs. The repository also provides software tools and training resources enabling worldwide access, sharing, and analysis of these datasets, facilitating further research and the continued expansion of the HOA. The images are generated using a synchrotron imaging technique, hierarchical phase-contrast tomography (HiP-CT), that uses the ESRF’s Extremely Brilliant Source, spanning whole-organ imaging at ~20 micrometers/voxel with local volumes of interest within intact organs imaged down to ~1 micrometer/voxel. This offers a comprehensive exploration of human anatomy, providing unparalleled insights into intricate structures and spatial relationships. The HOA offers researchers, clinicians, and educators a valuable resource for anatomical study, image analysis, medical education, and large-scale data mining.
- Research Article
- 10.55041/ijsrem57503
- Mar 11, 2026
- International Journal of Scientific Research in Engineering and Management
- Rohit R Wahane
Educational institutions generate a large amount of student performance data every semester. Traditional result analysis methods are mostly manual and time-consuming, making it difficult for faculty members to identify patterns and improve academic performance. This paper proposes a Smart Result Analysis System using Machine Learning that automates the analysis of student results and predicts academic performance. The system uses machine learning algorithms to analyse student marks, identify weak and strong learners, and generate statistical reports. The proposed system helps teachers understand performance trends, improve teaching strategies, and support slow learners. Experimental results show that the system provides efficient and accurate result analysis compared to manual methods. Keywords: Machine Learning, Result Analysis, Educational Data Mining, Student Performance Prediction, Academic Analytics.
- Research Article
- 10.3390/aieduc2010006
- Mar 9, 2026
- AI in Education
- Riyan Hasan + 1 more
Modeling the temporal dynamics of student learning is a central goal in educational data mining. Deep Knowledge Tracing (DKT) has emerged as a key approach, yet existing models are highly sensitive to out-of-distribution (OOD) inputs, such as those arising from curriculum changes, new assessment formats, or behavioral noise, which severely degrade predictive reliability. To address this challenge, we propose Energy-Based Out-of-Distribution Deep Knowledge Tracing (EB-OOD DKT), a unified framework that integrates energy-based uncertainty estimation and contrastive representation learning within a transformer-based DKT architecture. The model computes energy scores via the negative log-sum-exponential of prediction logits, serving as confidence indicators for detecting OOD inputs during inference. Additionally, an InfoNCE-based contrastive loss enhances representation robustness by aligning in-distribution samples and separating OOD cases in latent space. Temporal and behavioral context features, such as normalized response intervals and cumulative attempt counts, are incorporated to enrich cognitive-behavioral modeling. Experiments on four public educational datasets demonstrate consistent improvements in prediction accuracy and OOD detection. EB-OOD DKT provides a promising approach for more reliable student modeling across educational platforms with different content distributions.
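The energy score described in this abstract, the negative log-sum-exponential of the prediction logits, can be written out directly. The sketch below is illustrative only: the temperature parameter, the example logits, and the idea of comparing against a fixed threshold are assumptions, not details from the paper.

```python
import math

def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * log(sum_i exp(logit_i / T)), computed stably."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return -temperature * log_sum_exp

def is_out_of_distribution(logits, threshold):
    """Flag an input as OOD when its energy is above a chosen threshold (hypothetical rule)."""
    return energy_score(logits) > threshold

# Hypothetical logits for a binary correct/incorrect prediction.
confident_logits = [8.0, -2.0]   # one class clearly preferred
uncertain_logits = [0.1, 0.0]    # nearly flat, low-magnitude logits

confident_energy = energy_score(confident_logits)
uncertain_energy = energy_score(uncertain_logits)
```

Lower (more negative) energy corresponds to larger logits and therefore higher confidence, so the confident example gets a lower energy than the flat one; in practice a threshold would have to be calibrated on held-out in-distribution data.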
- Research Article
- 10.21686/1818-4243-2026-1-37-45
- Mar 8, 2026
- Open Education
- Alla V Kalinichenko
The purpose of the study. The digitalization of education implies large-scale transformations encompassing the implementation of digital technologies at every level of general, vocational, and additional education, as well as changes in the interactions of all participants in the educational process. One of the six key challenges addressed by the Education Development Strategy to 2036 is the rapid spread of digital technologies and artificial intelligence. The aim of this study is to determine the role of digital traces generated during the educational process as a tool for digital transformation in educational management, as well as to demonstrate the practical implementation of digital trace processing using the VKontakte social network API and data analysis methods (Educational Data Mining). Materials and Methods. The theoretical and methodological basis for the study was formed by the work of Russian and international researchers in the field of digital data analysis arising during the educational process. The study utilized data analysis methods, natural language processing techniques, and Python libraries such as pandas, numpy, matplotlib, and others. The empirical portion of the study is based on the analysis of the digital footprint of the educational organization's communities on the social network VKontakte, represented as unstructured text. Results. Research shows that large volumes of heterogeneous digital trace data, including those presented in the form of semi-structured data, inevitably arise in the context of the digitalization of education and ensuring the information openness of educational organizations. This data is of interest for educational analytics used to address issues related to the digital transformation of the educational process, the digital transformation of educational management, and the continuity and integration of educational levels.
The digital trace generated through interactions with the electronic information and educational environment and other digital resources of educational organizations on the internet (websites, social media pages, and instant messaging apps) opens up opportunities for analyzing data on the educational process and participants in educational relationships. However, systematic approaches to its analysis and use in the context of the digital transformation of education are required, including those that take into account legal requirements for personal data, ethical aspects, and security aspects. This article examines the prospects for analyzing digital data in educational organization communities on social media using data analysis and machine learning methods and presents a practical example of data analysis in such communities on the social network VKontakte using an API. Conclusion. The obtained results can be used both for initial studies of digital footprint analysis and as a basis for developing a system for generating educational analytics. Practical application of the results will facilitate the digital transformation of educational management.
- Research Article
- 10.1145/3798096
- Feb 26, 2026
- ACM Transactions on Knowledge Discovery from Data
- Valdemar Švábenský + 3 more
Background and context: Open datasets play a crucial role in three prominent research domains that intersect data science and education: learning analytics, educational data mining, and artificial intelligence in education. Researchers in these domains apply computational methods to analyze data from educational contexts, aiming to better understand and improve teaching and learning. Research scope and gap: Providing open datasets alongside research papers supports research reproducibility, fosters collaboration, and increases trust in research findings. It also provides individual benefits for authors, such as greater visibility, credibility, and citation potential. However, despite these advantages, the availability of open datasets and the associated practices within the learning analytics research communities, especially at their flagship conference venues, remain unclear. Goal and method: To address this gap, we conducted a systematic survey of publicly available datasets published alongside research papers in learning analytics domains. We manually examined 1,125 papers from three respected flagship conferences (LAK, EDM, and AIED) over the past five years (2020–2024). We discovered, categorized, and analyzed 172 unique datasets used in 204 publications. Results and contributions: Our study presents the most comprehensive collection and analysis of open educational datasets to date, along with the most detailed categorization. Of the 172 datasets identified, 143 were not captured in any prior survey of open data in learning analytics. We provide insights into the datasets’ context, analytical methods, use, and other properties. Based on this survey, we summarize the current gaps in the field. Furthermore, we list practical recommendations, advice, and 8-item guidelines under the acronym PRACTICE with a checklist to help researchers publish their data. 
Lastly, we share our original dataset: an annotated inventory detailing the discovered datasets and the corresponding publications. We hope these findings will support further adoption of open data practices in learning analytics communities and beyond.
- Research Article
- 10.69760/lumin.2026001007
- Feb 25, 2026
- Luminis Applied Science and Engineering
- Gerda Urbaite
Adaptive AI-driven learning systems personalize instruction by estimating learner state and dynamically selecting content, feedback, and pacing to improve mastery and engagement. This paper synthesizes peer-reviewed evidence on adaptive learning, intelligent tutoring, knowledge tracing, educational data mining, and recommender systems, and proposes an applied engineering framework suitable for deployment in higher-education STEM contexts. We ground personalization in classic student modeling (knowledge tracing) and modern sequence modeling (deep knowledge tracing), and integrate a multidimensional view of engagement to avoid reducing “engagement” to simple clickstream metrics. We then present a modular, service-oriented system architecture encompassing data ingestion, learner modeling, pedagogical decisioning, explainability, monitoring, and governance controls. A prototype evaluation is conducted using a simulation-based testbed (explicitly illustrative, not empirical) with synthetic learners and skills. Across 600 simulated learners and 25 skills over 120 learning steps, an adaptive policy improves average mastery (fraction of skills mastered at threshold) compared to non-adaptive paging and random sequencing, with markedly higher rates of reaching “80% mastery.” The results also show that naive optimization may widen outcome gaps across learner subgroups, motivating fairness-aware objectives and human-in-the-loop controls. Ethical, privacy, and accessibility requirements are addressed through risk management practices, differential privacy–compatible training options, transparent explanations, and WCAG-aligned interface design.
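The classic knowledge tracing that this abstract builds on is commonly formulated as Bayesian Knowledge Tracing (BKT), where the estimate of a learner's mastery of a skill is updated after each response. The sketch below shows that standard update only; it is not the paper's system, and the slip, guess, and learn parameter values are hypothetical.

```python
def bkt_update(p_known, correct, slip=0.1, guess=0.2, learn=0.15):
    """One Bayesian Knowledge Tracing step for a single skill.

    p_known: prior probability that the learner has mastered the skill.
    correct: whether the observed response was correct.
    slip:    probability of answering wrong despite mastery (hypothetical value).
    guess:   probability of answering right without mastery (hypothetical value).
    learn:   probability of acquiring the skill after a practice step (hypothetical value).
    """
    if correct:
        numerator = p_known * (1 - slip)
        denominator = numerator + (1 - p_known) * guess
    else:
        numerator = p_known * slip
        denominator = numerator + (1 - p_known) * (1 - guess)
    posterior = numerator / denominator          # Bayes rule on the observed response
    return posterior + (1 - posterior) * learn   # chance of learning from this step

def trace(responses, p_initial=0.3):
    """Run the update over a response sequence and return the mastery trajectory."""
    trajectory = [p_initial]
    for correct in responses:
        trajectory.append(bkt_update(trajectory[-1], correct))
    return trajectory

# Hypothetical response sequence: two errors followed by three correct answers.
mastery = trace([False, False, True, True, True])
```

An adaptive policy of the kind the abstract describes could, for example, keep selecting practice items for a skill until the traced mastery estimate passes a chosen threshold.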
- Research Article
- 10.3390/inventions11020020
- Feb 24, 2026
- Inventions
- Menna M S Elmasry + 2 more
Because of the substantial class disparity and the intricate interactions between academic, behavioral, and socioeconomic characteristics, anticipating student academic performance and dropout rates continues to be a major issue for institutions of higher learning. To improve the dependability and credibility of multiclass student outcome prediction, this study suggests a strong, multi-objective, and uncertainty-aware predictive framework that combines the Random Forest (RF) classifier with Holistic Swarm Optimization (HSO). The suggested method creates a multi-objective optimization problem that simultaneously maximizes macro F1-score, controls model complexity, and lessens inter-class performance disparity. Thereby, the model promotes fairness across student outcome categories, in contrast to traditional optimization strategies that only concentrate on predictive accuracy. Furthermore, by utilizing ensemble-based probability dispersion, the framework integrates uncertainty-aware prediction, making it possible to identify high-risk students with different degrees of confidence to assist practical academic interventions. According to the results of experiments, the suggested HSO-RF framework greatly reduces the performance gap between outcome classes while achieving the best overall predictive performance, reaching an accuracy of 77.74%, a macro F1-score of 0.69, and a weighted F1-score of 0.76. The analysis shows that academic, socioeconomic, and administrative characteristics serve as significant markers of student motivation, stability, and vulnerability in addition to computational benefits. The suggested architecture advances appropriate and trustworthy educational data mining and offers a dependable decision-support tool for early warning systems.
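One way to read "ensemble-based probability dispersion" is to measure how much the individual ensemble members disagree about the predicted class. The sketch below is an assumed interpretation, not the paper's definition: it averages per-member class probabilities, takes the arg-max class, and reports the population standard deviation of the members' probabilities for that class. The member outputs are hypothetical.

```python
from statistics import mean, pstdev

def ensemble_prediction(member_probs):
    """Aggregate per-member class probabilities into a prediction with an uncertainty score.

    member_probs: one probability vector per ensemble member (e.g. per tree).
    Returns (predicted class index, mean probability, dispersion across members).
    """
    n_classes = len(member_probs[0])
    mean_probs = [mean(p[c] for p in member_probs) for c in range(n_classes)]
    predicted = max(range(n_classes), key=lambda c: mean_probs[c])
    dispersion = pstdev([p[predicted] for p in member_probs])
    return predicted, mean_probs[predicted], dispersion

# Hypothetical outputs of three ensemble members for two students (three outcome classes).
members_agree = [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.2, 0.7, 0.1]]
members_disagree = [[0.9, 0.05, 0.05], [0.1, 0.85, 0.05], [0.5, 0.3, 0.2]]

agree_class, agree_prob, agree_dispersion = ensemble_prediction(members_agree)
split_class, split_prob, split_dispersion = ensemble_prediction(members_disagree)
```

A high dispersion marks a prediction the ensemble is split on, which is the kind of case that could be routed to a human advisor rather than acted on automatically.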