Abstract

Online course discussions contain abundant information about learners' cognition. Previous models classify cognitive engagement from semantic features alone and require a large amount of labeled data. Because semantic features alone cannot fully represent textual information, these models perform poorly when labeled data are scarce. Moreover, cognitive psychological features carry important information that semantic features cannot capture. Therefore, this paper proposes a dual feature embedding-based semi-supervised cognitive classification method that exploits the additional inductive biases introduced by implicit cognitive features to supplement generic semantic features. These additional inductive biases facilitate the propagation of label information from labeled to unlabeled data and improve the consistency between unlabeled and augmented data. For semi-supervised learning, unsupervised data augmentation (UDA) is used to obtain augmented data by injecting advanced augmentation noise into unlabeled data. Furthermore, bidirectional encoder representations from transformers (BERT) is used to extract generic semantic features, and linguistic inquiry and word count (LIWC) is adopted to extract implicit cognitive features from discussion texts. We refer to the proposed method as B-LIWC-UDA; it sequentially fuses the dual features at the explicit and hidden levels to obtain dual feature embeddings. The cognitive engagement classification model is trained using supervised and consistency training. We conducted experiments on datasets obtained from two real-world online course discussions. The experimental results demonstrate that, in terms of the major evaluation metrics, the proposed B-LIWC-UDA method outperforms state-of-the-art text classification methods for identifying cognitive engagement.
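
The training objective described above, a supervised term on labeled data plus a consistency term between unlabeled and augmented data, can be illustrated with a minimal sketch. It assumes the usual UDA-style formulation (cross-entropy plus a weighted KL divergence); the function names, the weight `lam`, and the plain-list logits are hypothetical and are not taken from the paper's implementation, which would compute the logits from the fused BERT and LIWC embeddings.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """Supervised loss for one labeled example."""
    return -math.log(max(probs[label], 1e-12))


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q): how far the augmented prediction q drifts from p."""
    return sum(
        pi * math.log(pi / max(qi, 1e-12))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def semi_supervised_loss(
    labeled_logits: Sequence[Sequence[float]],
    labels: Sequence[int],
    unlabeled_logits: Sequence[Sequence[float]],
    augmented_logits: Sequence[Sequence[float]],
    lam: float = 1.0,  # assumed weight of the consistency term
) -> float:
    """Mean supervised cross-entropy plus lam times mean consistency KL."""
    supervised = sum(
        cross_entropy(softmax(z), y) for z, y in zip(labeled_logits, labels)
    ) / len(labels)
    consistency = sum(
        kl_divergence(softmax(u), softmax(a))
        for u, a in zip(unlabeled_logits, augmented_logits)
    ) / len(unlabeled_logits)
    return supervised + lam * consistency
```

In this sketch the consistency term vanishes when the model predicts the same distribution for an unlabeled post and its augmented version, so only disagreement between the two is penalized.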
