Abstract

Poor academic performance of students is a concern in the educational sector, especially if it leads to students being unable to meet minimum course requirements. However, with timely prediction of students’ performance, educators can detect at-risk students, thereby enabling early interventions for supporting these students in overcoming their learning difficulties. However, the majority of studies have taken the approach of developing individual models that target a single course while developing prediction models. These models are tailored to specific attributes of each course amongst a very diverse set of possibilities. While this approach can yield accurate models in some instances, this strategy is associated with limitations. In many cases, overfitting can take place when course data is small or when new courses are devised. Additionally, maintaining a large suite of models per course is a significant overhead. This issue can be tackled by developing a generic and course-agnostic predictive model that captures more abstract patterns and is able to operate across all courses, irrespective of their differences. This study demonstrates how a generic predictive model can be developed that identifies at-risk students across a wide variety of courses. Experiments were conducted using a range of algorithms, with the generic model producing an effective accuracy. The findings showed that the CatBoost algorithm performed the best on our dataset across the F-measure, ROC (receiver operating characteristic) curve and AUC scores; therefore, it is an excellent candidate algorithm for providing solutions on this domain given its capabilities to seamlessly handle categorical and missing data, which is frequently a feature in educational datasets.

Highlights

  • In today’s competitive world, higher education (HE) institutions need to deliver efficient and quality education to retain their students

  • We can observe from the results that the F-measure scores range from 0.67 to 0.77, with CatBoost achieving the highest score

  • The overall accuracy of all the algorithms has ranged between 65% and 75%, with the result from CatBoost again clearly outperforming the other algorithms on these datasets

Read more

Summary

Introduction

In today’s competitive world, higher education (HE) institutions need to deliver efficient and quality education to retain their students. To understand and assist student learning, HE providers are implementing learning analytics (LA) systems; these systems provide data-driven insights about students to assist educators in determining their overall academic progression. This technology is being leveraged to predict at-risk students and identify their learning problems at early stages, with the purpose of initiating timely interventions and tailoring education [1,2,3] to each student’s level of need. LMSs are web-based learning systems that offer a virtual platform for facilitating teaching and learning, such as providing students with online course content, tracking student interactions, enabling peer communication over online forums, delivering course assessments to students, or releasing assessment grades [5].

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call