Abstract

In a MOOC learning environment, it is essential for instructors to understand students' social knowledge constructs and critical thinking in order to design intervention strategies. The development of social knowledge constructs and critical thinking can be represented by cognitive presence, a primary component of the Community of Inquiry (CoI) model. However, identifying learners' cognitive presence is a challenging problem, and most researchers have approached this task with traditional machine learning methods that require both manual feature construction and adequate labeled data. In this paper, we present MOOC-BERT, a novel variant of the Bidirectional Encoder Representations from Transformers (BERT) model for cognitive presence identification, which is pre-trained on large-scale unlabeled discussion data collected from MOOCs across different disciplines. MOOC-BERT learns deep representations from unlabeled data and takes Chinese characters as input without any feature engineering. The experimental results showed that MOOC-BERT outperformed representative machine learning algorithms and deep learning models in both identification performance and cross-course generalization. MOOC-BERT was then applied to identify the unlabeled posts of two courses, and the empirical analysis revealed the evolution of, and differences in, MOOC learners' cognitive presence levels. These findings provide valuable insights into the effectiveness of pre-training on large-scale, multidisciplinary discussion data for accurate cognitive presence identification, demonstrating the practical value of MOOC-BERT in learning analytics.
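The abstract notes that MOOC-BERT takes Chinese characters as input without any feature engineering. As a rough illustration of what character-level input preparation looks like (this is a hypothetical sketch, not the authors' code; the vocabulary, special tokens, and function names below are illustrative assumptions in the style of BERT's input conventions):

```python
# Sketch: character-level tokenization of a Chinese discussion post.
# BERT-style models for Chinese typically split text into individual
# characters and wrap each sequence in [CLS]/[SEP] markers; no manual
# linguistic features are constructed.

def char_tokenize(post: str) -> list[str]:
    """Split a post into single characters (whitespace dropped),
    wrapped in BERT-style [CLS]/[SEP] special tokens."""
    chars = [c for c in post if not c.isspace()]
    return ["[CLS]"] + chars + ["[SEP]"]


def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Map tokens to integer ids, falling back to the [UNK] id."""
    return [vocab.get(t, vocab["[UNK]"]) for t in tokens]


if __name__ == "__main__":
    post = "我认为这个观点很有启发"  # "I find this viewpoint very inspiring"
    tokens = char_tokenize(post)
    # Toy vocabulary built from the post itself; a real pre-trained
    # model ships a fixed vocabulary file.
    vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3}
    for c in dict.fromkeys(post):
        vocab.setdefault(c, len(vocab))
    print(tokens)
    print(encode(tokens, vocab))
```

The resulting id sequences are what a BERT-style encoder would consume, first for masked-language-model pre-training on unlabeled posts and later for fine-tuning a classifier over the cognitive presence phases.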
