Abstract

In higher education, universities gather large amounts of data for administrative purposes and to record the evolution of students’ learning processes, which makes novel data mining techniques particularly useful for tackling critical issues. In Italy, current academic regulations allow students to customize the chronological order of the courses they must attend to obtain their degree. This leads to a variety of exam sequences, and the average time taken to obtain the degree may differ significantly from the time established by law. In this contribution, we propose a mixture hidden Markov model to classify students into groups that are homogeneous in terms of university paths, with the aim of detecting bottlenecks in the academic career and improving students’ performance.

Highlights

  • In higher education, the availability of administrative data has grown significantly in the last decade, making learning analytics techniques useful for addressing critical issues and providing insights that can benefit students, teaching staff, and policy makers

  • The model we propose is composed of three parts: (i) a Hidden Markov (HM) sub-model to account for the serial dependence as well as the time-varying unobserved heterogeneity, (ii) a latent class sub-model to deal with the time-constant unobserved heterogeneity, and (iii) a multinomial logit sub-model to deal with the dependence of latent class membership on individual characteristics

  • We first illustrate and discuss results related to the HM model (Section 4.2), and then extend the analysis to the Mixture HM (MHM) model with covariates (Sections 4.3 and 4.4)
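
The three-part structure described in the highlights can be written compactly as a single likelihood. The notation below is an assumption made for illustration and is not taken from the paper: y_i is the observed sequence of student i over T time points, x_i is the covariate vector, k indexes the K latent classes, u_t is the hidden state at time t, and δ, a, b are the class-specific initial, transition, and emission probabilities.

```latex
\[
P(y_i \mid x_i)
  = \sum_{k=1}^{K} \pi_k(x_i)
    \sum_{u_1,\dots,u_T}
      \delta^{(k)}_{u_1}
      \prod_{t=2}^{T} a^{(k)}_{u_{t-1}u_t}
      \prod_{t=1}^{T} b^{(k)}_{u_t}(y_{it}),
\qquad
\pi_k(x_i) = \frac{\exp(x_i^{\top}\gamma_k)}{\sum_{j=1}^{K}\exp(x_i^{\top}\gamma_j)},
\quad \gamma_1 = 0 .
\]
```

In this sketch, the inner sum over hidden-state paths is the HM sub-model, the outer sum over k is the latent class sub-model, and π_k(x_i) is the multinomial logit sub-model, with the first class taken as the reference.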

Summary

Introduction

In higher education, the availability of administrative data has grown significantly in the last decade, making learning analytics techniques useful for addressing critical issues and providing insights that can benefit students, teaching staff, and policy makers. As opposed to optimal matching, model-based approaches embed the analysis of sequence data in a probabilistic framework, with benefits in terms of the generalisability of results. In such a context, latent variable models for longitudinal data represent a wide class of models [14,15], offering several alternatives to properly take into account the main characteristics of sequence data, that is, (i) serial dependence or autocorrelation (i.e., the correlation between the responses of the same individual), (ii) unobserved heterogeneity in the individuals (i.e., variability in the data that is due to unobservable individual characteristics), and (iii) the dependence of the observed data on covariates.
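
To make the three characteristics concrete, here is a minimal sketch of how the likelihood of one observed sequence could be computed under a mixture hidden Markov model with covariates. It is not the authors' implementation: the function names, the two-class and two-state setup, and every numeric value are hypothetical and chosen only for illustration. Serial dependence is handled by the forward recursion, unobserved heterogeneity by the latent classes, and covariates by a multinomial logit on class membership.

```python
import numpy as np

def softmax_priors(x, gamma):
    """Multinomial logit class-membership probabilities.

    x: (p,) covariate vector including an intercept.
    gamma: (K, p) coefficients; row 0 is all zeros (reference class).
    """
    eta = gamma @ x
    eta = eta - eta.max()          # stabilise the exponentials
    w = np.exp(eta)
    return w / w.sum()

def hmm_loglik(y, init, trans, emis):
    """Scaled forward algorithm: log P(y) for one categorical sequence.

    init: (S,) initial state probabilities.
    trans: (S, S) transition matrix, rows sum to 1.
    emis: (S, M) emission matrix, rows sum to 1.
    """
    alpha = init * emis[:, y[0]]
    c = alpha.sum()
    loglik = np.log(c)
    alpha = alpha / c
    for obs in y[1:]:
        alpha = (alpha @ trans) * emis[:, obs]
        c = alpha.sum()            # rescale to avoid underflow
        loglik += np.log(c)
        alpha = alpha / c
    return loglik

def mhmm_loglik(y, x, gamma, classes):
    """Mixture log-likelihood and posterior class probabilities."""
    priors = softmax_priors(x, gamma)
    comp = np.array([hmm_loglik(y, *params) for params in classes])
    a = np.log(priors) + comp
    m = a.max()
    loglik = m + np.log(np.exp(a - m).sum())   # log-sum-exp over classes
    posterior = np.exp(a - loglik)
    return loglik, posterior

# Toy, made-up parameters: 2 latent classes, 2 hidden states, 3 observed categories.
classes = [
    (np.array([0.9, 0.1]),
     np.array([[0.8, 0.2], [0.1, 0.9]]),
     np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])),
    (np.array([0.5, 0.5]),
     np.array([[0.6, 0.4], [0.3, 0.7]]),
     np.array([[0.2, 0.5, 0.3], [0.3, 0.3, 0.4]])),
]
gamma = np.array([[0.0, 0.0], [-0.5, 1.0]])  # hypothetical logit coefficients
x = np.array([1.0, 1.0])                     # intercept plus one covariate
y = [0, 0, 2, 1]                             # one observed sequence of category codes

loglik, posterior = mhmm_loglik(y, x, gamma, classes)
print("log-likelihood:", loglik)
print("posterior class probabilities:", posterior)
```

The posterior class probabilities returned here are what would be used to assign a student to a group. In practice the parameters would be estimated, typically by an EM algorithm, rather than fixed by hand as in this sketch.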

Data Description
Mixture Hidden Markov Models for Sequence Data
Analysis of Student Paths
Model Specification
Hidden Markov Model
Mixture Hidden Markov Model
Effect of Concomitant Variables
Findings
Conclusions
