Purpose
In recent years, studies have shown that machine learning significantly improves student performance and retention and reduces the risk of student dropout and withdrawal. However, few reviews of empirical research focus on applying machine learning to predict student performance in terms of learning engagement and self-efficacy, or on exploring the relationships among these outcomes. This paper therefore presents a systematic review of the application of machine learning in higher education from an empirical research perspective.

Design/methodology/approach
This systematic review examines the application of machine learning (ML) in higher education, focusing on predicting student performance, engagement and self-efficacy. The review covers empirical studies published from 2016 to 2024 and uses the PRISMA framework to select 67 relevant articles from major databases.

Findings
The findings show that ML applications are widely researched and published in high-impact journals. The primary functions of ML in these studies are performance prediction, engagement analysis and self-efficacy assessment, carried out with algorithms such as decision trees, random forests, support vector machines and neural networks. Ensemble learning algorithms generally outperform single algorithms on accuracy and other evaluation metrics. Common model evaluation metrics include accuracy, F1 score, recall and precision, with newer methods also being explored.

Research limitations/implications
First, the empirical literature was drawn from only four well-known electronic journal databases and was limited to journal articles; the latest review articles and conference papers were excluded, so the review does not capture the most recent views of researchers in interdisciplinary fields.
Second, this review focused mainly on the prediction of student grades, learning engagement and self-efficacy and did not examine student risk, dropout rates, retention rates or learning behaviors, which limited both the scope of the literature review and the range of machine learning applications considered. Finally, this article only conducted a systematic review of the application of machine learning algorithms in higher education and did not establish a metadata list or carry out metadata analysis.

Originality/value
The review highlights ML’s potential to support personalized education, early intervention and the identification of at-risk students. Future research should improve prediction accuracy, explore new algorithms and address the limitations of current studies, particularly the narrow focus on specific outcomes and the lack of interdisciplinary perspectives.