Abstract

In the context of programming language courses, offering early and personalized assistance to students who struggle with code exercises is fundamental to helping them improve their performance. To this end, it is desirable to detect, in a timely manner, those students whose behavioral patterns could potentially result in poor performance. In this study, we analyze different behavioral metrics for predicting the performance of students in a second‐year computer science course. Metrics are extracted from practical sessions, that is, code exercise labs that can be completed online over a period of 5–7 days. In addition to well‐known metrics (e.g., the relative time spent solving compilation/execution errors or skills for removing compilation errors), we also incorporate metrics that capture the work habits of students (e.g., the timestamps at which a student has completed 25%, 50%, and 75% of the exercise lab) and the similarity between code submissions. With a sample of 224 students from seven different course groups, we construct robust predictors by training different machine learning models, reaching an R² value of up to 0.40. We also perform a machine learning‐based analysis to detect the most relevant predictor variables and analyze their relations to the final grades of students. The results show that the newly introduced metrics carry enough predictive power to merit inclusion in the model. Finally, we graphically illustrate the nonlinear and seemingly interdependent relations that exist between the predictor and response variables.
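
The abstract describes the pipeline without naming concrete models or libraries. As a rough illustration only, the Python sketch below shows one way such a setup could look with scikit-learn: all feature names are invented stand-ins for the behavioral metrics mentioned above, the data is synthetic, and RandomForestRegressor is merely one plausible model choice, not necessarily the one used in the study.

    # Illustrative sketch (not the authors' code): train a regressor on
    # hypothetical behavioral metrics and score it with R², as the
    # abstract describes. Data and feature names are synthetic.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_students = 224  # sample size reported in the abstract

    # Hypothetical behavioral metrics per student (one column each):
    #   frac_time_on_errors - relative time spent fixing compile/run errors
    #   t25, t50, t75       - normalized timestamps at 25/50/75% of the lab
    #   code_similarity     - similarity of the student's code to peers'
    X = rng.random((n_students, 5))
    # Synthetic target standing in for final grades (0-10 scale).
    y = 10 * rng.random(n_students)

    model = RandomForestRegressor(n_estimators=200, random_state=0)
    r2_scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"Mean cross-validated R²: {r2_scores.mean():.2f}")

    # Feature relevance, analogous to the abstract's analysis of the
    # most relevant predictor variables.
    model.fit(X, y)
    print("Feature importances:", model.feature_importances_)

On real data, the cross-validated R² and the per-feature importances would correspond to the headline result (R² up to 0.40) and the variable-relevance analysis reported in the abstract.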
