Abstract

Multi-view learning is a machine learning approach that aims to exploit the knowledge retrieved from data represented by multiple feature subsets known as views. Co-training is considered the most representative form of multi-view learning; it is a very effective semi-supervised classification algorithm for building highly accurate and robust predictive models. Even though it has been applied in various scientific fields, it has not been adequately used in educational data mining and learning analytics, since the assumption that two feature views exist cannot easily be satisfied. Some notable studies have emerged recently dealing with semi-supervised classification tasks, such as student performance or student dropout prediction, while semi-supervised regression remains uncharted territory. Therefore, the present study implements a semi-supervised regression algorithm for predicting the grades of undergraduate students in the final exams of a one-year online course; the algorithm exploits three independent and naturally formed feature views, since they are derived from different sources. Moreover, we examine a well-established framework for interpreting the acquired results in terms of each feature's contribution to the final outcome per student/instance. To this end, a series of experiments is conducted based on data provided by the Hellenic Open University and representative machine learning algorithms. The experimental results demonstrate that the early prognosis of students at risk of failure can be achieved accurately in comparison with supervised models, even with a small amount of initially collected data from the first two semesters. The robustness of the applied semi-supervised regression scheme, along with supervised learners and the investigation of the features' reasoning, could highly benefit the educational domain.

Highlights

  • Educational data mining (EDM) has emerged in the past two decades as a rapidly growing research field concerning the development and implementation of machine learning (ML) methods for analyzing datasets coming from various educational environments [1]

  • The results indicated the efficiency of the semi-supervised regression (SSR) algorithm compared to familiar regression methods, such as linear regression (LR), model trees (MTs), and random forests (RFs)

  • To systematically examine the efficiency of the extended COREG variant on the problem of early prognosis of students' performance, various instance-based selectors and different learning models for the regressors were chosen


Summary

Introduction

Educational data mining (EDM) has emerged in the past two decades as a rapidly growing research field concerning the development and implementation of machine learning (ML) methods for analyzing datasets coming from various educational environments [1]. Most EDM research focuses on implementing supervised methods that utilize only labeled datasets. To this end, a plethora of classification and regression techniques have been successfully applied for predicting various learning outcomes of students, such as dropout, attrition, failure, academic performance, and grades, to name a few. We implement a well-known semi-supervised regression algorithm that is based on multi-view learning, adopting several ML learners as its core regressors, to tackle the early prediction of undergraduate students' final exam grades in a one-year distance learning course. We investigate the effectiveness of the resulting SSR variants compared with the performance of their corresponding supervised learners on the examined EDM task. In this sense, the proposed model may serve as an early alert tool with a view to providing appropriate interventions and support actions to low performers.
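To make the idea of multi-view semi-supervised regression more concrete, the following is a minimal sketch of a COREG-style co-training loop over several views. It is not the authors' implementation. The k-nearest-neighbour base regressor, the leave-one-out error estimate, the sequential update order, all function names, and the toy data are assumptions chosen for illustration. Each view's regressor pseudo-labels the unlabeled instance that most reduces the error on its nearest labeled neighbours, and that instance is then added to the training sets of the other views.

```python
import math

def knn_predict(train, query, k=3, skip=None):
    """Mean target of the k training points closest to `query`.
    `train` is a list of (features, target); `skip` excludes one index."""
    cand = [(math.dist(x, query), y)
            for i, (x, y) in enumerate(train) if i != skip]
    cand.sort(key=lambda t: t[0])
    top = cand[:k]
    return sum(y for _, y in top) / len(top)

def loo_sq_error(train, idxs, k):
    """Leave-one-out squared error of kNN on the points in `idxs` (assumed estimate)."""
    return sum((train[i][1] - knn_predict(train, train[i][0], k, skip=i)) ** 2
               for i in idxs)

def most_confident(train, pool, k):
    """Pick the (instance id, pseudo-label) whose addition most reduces the
    error on the candidate's k nearest labeled neighbours, or None."""
    best, best_gain = None, 0.0
    for key, x in pool.items():
        y_hat = knn_predict(train, x, k)
        near = sorted(range(len(train)),
                      key=lambda i: math.dist(train[i][0], x))[:k]
        gain = (loo_sq_error(train, near, k)
                - loo_sq_error(train + [(x, y_hat)], near, k))
        if gain > best_gain:
            best, best_gain = (key, y_hat), gain
    return best

def cotrain_regression(views, y, labeled, unlabeled, k=3, rounds=10):
    """views[v][i] holds the features of instance i in view v."""
    n_views = len(views)
    train = [[(views[v][i], y[i]) for i in labeled] for v in range(n_views)]
    pool = set(unlabeled)
    for _ in range(rounds):
        changed = False
        for v in range(n_views):
            pick = most_confident(train[v], {i: views[v][i] for i in pool}, k)
            if pick is None:
                continue
            i, y_hat = pick
            # The pseudo-labeled instance teaches the *other* views.
            for w in range(n_views):
                if w != v:
                    train[w].append((views[w][i], y_hat))
            pool.discard(i)
            changed = True
        if not changed:
            break
    return train

def predict(train, query_views, k=3):
    """Average the per-view kNN predictions."""
    preds = [knn_predict(t, q, k) for t, q in zip(train, query_views)]
    return sum(preds) / len(preds)

# Toy data (hypothetical): 20 instances, three one-dimensional views.
views = [[(float(i),) for i in range(20)],
         [(2.0 * i,) for i in range(20)],
         [(i * i / 10.0,) for i in range(20)]]
labeled = [0, 4, 8, 12, 16, 19]
unlabeled = [i for i in range(20) if i not in labeled]
y = {i: float(i) for i in labeled}

train = cotrain_regression(views, y, labeled, unlabeled)
pred = predict(train, [views[v][10] for v in range(3)])
```

Because every pseudo-label is an average of existing targets, the predictions of this sketch always stay within the range of the initially labeled grades. A practical system would likely swap in stronger base regressors and a held-out validation step.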

Interpretability in Machine Learning
Related Work
Dataset Description
Gathering
Proposed
Experimental Process and Results
Conclusions