Abstract

During the COVID‐19 pandemic, the challenges associated with the transition from face‐to‐face to emergency remote education increased concerns about student dropout. Aligned with this concern, this study investigates the impact of the pandemic on the dropout patterns of 3371 undergraduate students from a Brazilian institution. Using data mining and machine learning techniques, we developed predictive dropout models based on student data preceding and succeeding the onset of the pandemic. Through the interpretation and comparison of these models and with the support of statistical and graphical analyses, we identify that the patterns persistently indicate that young students in their initial semesters, characterized by lower income, academic performance, and interaction, remain most susceptible to dropping out. Despite the pandemic leading to an enhanced predictive capability of data regarding student interaction within the virtual learning environment, our analysis revealed a lack of significant variation in dropout patterns. Institutionally, this indicates that a considerable number of dropouts likely encountered challenges in adapting to higher education, both before and throughout the pandemic.

Practitioner notes

What is already known about this topic
The challenges posed by emergency remote learning, implemented during the COVID‐19 pandemic, may exacerbate the dropout problem and change the patterns involved in this phenomenon. Despite being widely used to identify dropout profiles and/or predict at‐risk students, data mining and machine learning techniques have been little explored in the investigation of changes associated with the pandemic context.

What this paper adds
We employ data mining and machine learning techniques to construct predictive and interpretable dropout models for the pre‐ and during‐pandemic contexts of a Brazilian institution. Comparing these models, we investigate the impacts of the pandemic on dropout patterns.
The pandemic and the shift to emergency remote learning have enhanced the predictive capability of data regarding student interaction within the virtual learning environment. Throughout the pandemic, limited variation was observed in dropout patterns, which consistently highlighted young students in their initial semesters with lower income, academic performance and levels of interaction.

Implications for practice and/or policy
This study urges the inclusion of interactional student data in future dropout prediction research, capitalizing on the enhanced predictive power attained through the widespread adoption of virtual learning environments. Institutionally, the dropout patterns from before and during the pandemic suggest that students may be facing difficulties in adapting to higher education. In addition to the need to intensify preventive actions, this work indicates the need to conduct a study specifically targeting first‐semester students to understand their needs better and redesign preventive policies.