Abstract

Many educational institutions are using predictive models to derive actionable insights from student data and drive student success. A common task has been predicting students at risk of dropping out so that the necessary interventions can be made. However, issues of discrimination by these predictive models based on protected attributes of students have recently been raised. A question that is constantly asked is: should the protected attributes be excluded from learning analytics (LA) models in order to ensure fairness? In this work, we aimed to answer the following questions: if we exclude the protected attributes from the LA models, does the exclusion ensure fairness as it supposedly should? Does the exclusion affect the performance of the LA model? If so, why? We answer these questions and go further to explain why. We built machine learning models and performed empirical evaluations using three years of dropout data for a particular program at a large Australian university. We found that excluding or including the protected attributes had only a marginal effect on predictive performance and fairness. Perhaps not surprisingly, our findings suggest that the effect of including or excluding protected attributes is a function of how they relate to the prediction outcome. More specifically, if a protected attribute is correlated with the target label and proves to be an important feature, then its inclusion or exclusion affects performance and fairness, and vice versa. Our findings provide insightful information that relevant stakeholders can use to make well-informed decisions.
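To illustrate the kind of evaluation described above, the following is a minimal Python sketch that trains a dropout classifier with and without a protected attribute and compares accuracy against a simple fairness measure (demographic parity difference). The synthetic data, column names, and model choice are illustrative assumptions, not the study's actual dataset or code.

```python
# Hypothetical sketch: compare dropout-prediction performance and fairness
# with and without a protected attribute. The data below are synthetic and
# purely illustrative, not the study's dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 5000

# Synthetic student records: GPA and attendance drive dropout;
# the protected attribute is only weakly related to the outcome.
protected = rng.integers(0, 2, n)              # e.g. a binary demographic group
gpa = rng.normal(3.0, 0.6, n)
attendance = rng.normal(0.8, 0.15, n)
logit = -2.0 + 1.5 * (3.0 - gpa) + 2.0 * (0.8 - attendance) + 0.1 * protected
dropout = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_with = np.column_stack([gpa, attendance, protected])
X_without = np.column_stack([gpa, attendance])

def evaluate(X, y, prot):
    """Return test accuracy and demographic parity difference."""
    X_tr, X_te, y_tr, y_te, p_tr, p_te = train_test_split(
        X, y, prot, test_size=0.3, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    acc = accuracy_score(y_te, pred)
    # Demographic parity difference: gap in predicted dropout rates by group.
    dp_diff = abs(pred[p_te == 1].mean() - pred[p_te == 0].mean())
    return acc, dp_diff

for name, X in [("with protected attr", X_with), ("without protected attr", X_without)]:
    acc, dp = evaluate(X, dropout, protected)
    print(f"{name}: accuracy={acc:.3f}, demographic parity diff={dp:.3f}")
```

Under this setup, the gap between the two configurations shrinks as the protected attribute's correlation with the outcome (and hence its feature importance) decreases, which mirrors the pattern reported in the abstract.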
