Objective: As online learning continues to develop, analyzing students' online learning behavior has become increasingly important. Understanding which factors influence students' engagement in online learning is crucial to improving their learning performance. Methods: Students' online learning behavior data were collected with web crawling techniques from the Chinese University massive open online course (MOOC) platform. The synthetic minority oversampling technique (SMOTE) was used to address class imbalance in the dataset. Course progress was used to reflect students' online learning status, which was categorized as either interruption or completion. Furthermore, to address the low computational efficiency of the C4.5 decision tree algorithm, its calculation formula was modified to produce an improved version of C4.5. Findings: Of the factors analyzed, the number of course chapters had the greatest impact on students' online learning, followed by the number of course evaluations and the overall course score. When students' online learning status was classified, the improved C4.5 algorithm achieved the highest accuracy (0.942) and the shortest classification time (0.165 s) among the compared methods, which included the naive Bayes and random forest algorithms. Novelty: This study designed an improved version of C4.5 to analyze the factors that influence online learning and demonstrated its reliability through experiments, providing a new and effective method for data analysis in online learning. DOI: 10.28991/HIJ-2024-05-02-018
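The abstract does not state the revised calculation formula, so it cannot be reproduced here. As a point of reference only, the following minimal Python sketch shows the standard C4.5 gain ratio criterion, which is the calculation that the improved version modifies. The function names and the toy data are illustrative assumptions and are not taken from the study.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Standard C4.5 gain ratio for one categorical feature.

    This is the textbook criterion, not the improved formula from the study.
    """
    total = len(labels)

    # Group the class labels by the feature value they occur with.
    partitions = {}
    for value, label in zip(feature_values, labels):
        partitions.setdefault(value, []).append(label)

    # Information gain: parent entropy minus the weighted child entropy.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder

    # Split information penalizes features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: a binned chapter count and the learning status.
chapters = ["few", "few", "many", "many"]
status = ["completion", "completion", "interruption", "interruption"]
print(gain_ratio(chapters, status))
```

In this toy case the feature separates the two classes perfectly, so the gain ratio reaches its maximum. The repeated logarithm evaluations in the entropy terms are the usual source of C4.5's computational cost, which is presumably what the modified formula targets.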