Purpose
This study focuses on early warning in K-12 online education and addresses three research gaps (prediction timing, model performance and model understandability) through three analytic tasks: (1) determining the best prediction timing based on changes in course requirements; (2) comparing early warning models using appropriate evaluation indicators; and (3) interpreting a complex predictive model using interpretable AI techniques. Through holistic analyses, the case study shows that the semester can be divided into three stages with different course requirements. Students who fail to meet the course requirements of an individual stage are predicted as at-risk. Interpretable AI techniques identified the key predictors at each stage and revealed the factors that cause a student to be predicted as at-risk. In addition, multiple at-risk types were identified through the analyses.

Design/methodology/approach
This study employed a variational autoencoder and time series segmentation to detect changes in learning behavior patterns resulting from changes in course requirements. On this basis, the semester was divided into three stages. Complex ensemble machine learning classifiers were then employed for early warning at each stage, enabling accurate prediction of students' learning performance. Interpretable AI techniques were used to gain insight, identify the characteristics of different at-risk types and suggest personalized interventions.

Findings
The case study analyzed 16,011 K-12 online school students. Major findings include: (1) The semester was divided into three segments based on learning pattern transition points. XGBoost identified the most at-risk students in each segment. Results indicate that the second stage (S2) is a relatively appropriate stage for early warning prediction and intervention. By the end of the second stage, the models can already identify over 75% of at-risk students.
A student's at-risk probability in the previous stage has the strongest influence on their performance in the next stage. A student with a high at-risk probability is more likely to be labeled as potentially at-risk in the next stage, whereas a student with a low at-risk probability is more likely to be labeled as potentially successful; this reflects the influence between neighboring learning stages. We also found that early warning can be conducted most effectively in the last week of each of the three stages. (2) From S1 to S3, the categories of key behaviors changed significantly: the key behaviors at S1 were assignment viewing and submitting, at S2 resource viewing and grade checking, and at S3 assignment viewing and submitting again. Discussion behavior became unimportant, while testing behavior gradually became important, reflecting changes in course requirements and learning strategies. In specific stages of the semester, more frequent behavior is not always better; some behaviors, when excessive at the wrong time, can have a negative impact on the prediction results. (3) Five types of at-risk students were found at each stage through interpretable AI analyses: high-engaged at-risk, low-engaged at-risk, testing at-risk, low-interaction at-risk and un-persistent at-risk, representing five learning patterns that led to potential at-risk status. As the course progressed, students adjusted their learning strategies and could shift from one at-risk type to another at-risk type or to the successful type. S2 was the most important stage, as it gave students their last opportunity to reverse their at-risk trends. (4) Students who do not perform well in the early stages still have the opportunity to reverse the at-risk trend with effort in the later stages, which underscores the importance of intervention at the end of the second stage.
Similarly, students who perform well in the early stages can still fail if they do not put in effort in the final stage. Compared with other at-risk types, low-engaged students tended to remain low-engaged and to fail at the end of the semester, because they lacked the learning activity needed to adjust their learning strategy.

Research limitations/implications
A follow-up study can focus on developing personalized interventions and observing whether these interventions influence potentially at-risk students' learning strategies, guide students more effectively to align with course requirements and ultimately improve performance.

Originality/value
This study is one of the few that focus on large-scale K-12 early warning prediction and aim to identify at-risk types via interpretable AI for personalized intervention. The authors found that the proposed method can effectively determine the best prediction timing, identify more at-risk students, and gain deep insight into students' learning processes. The authors confirm that the research in their work is original, and that all the data given in the paper are real and authentic. The study has not been submitted to peer review and has not been accepted for publication in another journal.
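
The idea of dividing a semester into three stages at learning pattern transition points can be illustrated with a small sketch. This is not the authors' pipeline: the study combines a variational autoencoder with time series segmentation, whereas the code below substitutes a plain brute-force least-squares segmentation of a single activity series. The function names `sse` and `segment_into_three`, the `min_len` parameter and the `weekly_activity` values are all hypothetical and made up for illustration.

```python
from itertools import combinations


def sse(values):
    """Sum of squared deviations of `values` from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def segment_into_three(series, min_len=2):
    """Find breakpoints (b1, b2) splitting `series` into three segments.

    The segments are series[:b1], series[b1:b2] and series[b2:]. The pair
    returned minimizes the total within-segment squared error, with every
    segment at least `min_len` points long. This is an exhaustive search,
    used here as a stand-in for the segmentation step, not the study's method.
    """
    n = len(series)
    if n < 3 * min_len:
        raise ValueError("series is too short for three segments")
    best = None
    # b1 < b2 is guaranteed by combinations() over an increasing range.
    for b1, b2 in combinations(range(min_len, n - min_len + 1), 2):
        if b2 - b1 < min_len:
            continue  # middle segment would be too short
        cost = sse(series[:b1]) + sse(series[b1:b2]) + sse(series[b2:])
        if best is None or cost < best[0]:
            best = (cost, b1, b2)
    return best[1], best[2]


# Toy data (invented): mean weekly activity counts over an 18-week semester.
weekly_activity = [10, 11, 9, 10, 12, 10,
                   20, 21, 19, 22, 20, 21,
                   5, 6, 4, 5, 6, 5]

breakpoints = segment_into_three(weekly_activity)
print(breakpoints)  # (6, 12): stages are weeks 1-6, 7-12 and 13-18
```

On real data, the input would more plausibly be a multivariate behavioral representation than a single activity count, and a dedicated change point detection method would replace the exhaustive search.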