Abstract

Urgent intervention in learner forum posts has recently become an important research topic in Massive Open Online Course (MOOC) environments. A timely intervention may make the difference between a learner dropping out and staying on a course. However, due to the typically extremely high learner-to-instructor ratio in MOOCs, it is very challenging, if not sometimes impossible, for an instructor to monitor all posts and identify which ones need immediate intervention to encourage retention. Current approaches are based on shallow machine learning and deep learning. Whilst deep learning methods have been shown to be the most accurate in many domains, the exact architecture can be very domain-dependent. In spite of their sheer size and representational power, deep neural networks are known to perform better when a problem is divided into the right sub-problems. These sub-problems can then be assembled to answer the original problem, in what we intuitively call a ‘plug & play’ fashion, similar to assembling a puzzle, via hybrid (deep) neural networks. Hence, in this paper, we address this problem by proposing a classification model, based on hybrid neural networks, for identifying when a given post needs intervention from an instructor. We represent words using two different methods: word2vec, which captures a word’s semantic and syntactic characteristics, and a transformer model (BERT), which represents each word according to its context. We then construct different architectures, integrating various deep neural networks (DNNs), either ‘word-based’ or ‘word-character-based’, as we expected that adding character-sequence information might increase performance. For the word-based models, we apply a convolutional neural network (CNN) and/or different types of recurrent neural networks (RNNs); in some scenarios we add attention. This provides a comprehensive answer to the character-sequence question in particular, and to the prediction of the urgency of intervention in MOOC forums in general. Experimental results demonstrate that using BERT rather than word2vec as the word embedding enhances performance across the different models (the optimal result is achieved by the word-level CNN + LSTM + Attention model based on BERT). Interestingly, adding word-character input does not improve performance for BERT, as it does for word2vec.
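To make the best-performing hybrid concrete, the following is a minimal sketch (not the authors’ released code) of a CNN + LSTM + Attention classifier operating on precomputed contextual (BERT-style) word embeddings, assuming a binary intervene/do-not-intervene label. All layer sizes and names are illustrative assumptions.

```python
# Hypothetical sketch of the word-level CNN + LSTM + Attention hybrid.
# Assumes each forum post arrives as a (seq_len, 768) matrix of
# BERT-style token embeddings; dimensions are illustrative, not the paper's.
import torch
import torch.nn as nn

class CnnLstmAttentionClassifier(nn.Module):
    def __init__(self, embed_dim=768, conv_channels=128, lstm_hidden=128):
        super().__init__()
        # 1-D convolution over the token axis extracts local n-gram features.
        self.conv = nn.Conv1d(embed_dim, conv_channels, kernel_size=3, padding=1)
        # LSTM models longer-range sequential structure over the conv features.
        self.lstm = nn.LSTM(conv_channels, lstm_hidden, batch_first=True)
        # Additive attention scores each timestep; the weighted sum summarises the post.
        self.attn = nn.Linear(lstm_hidden, 1)
        self.out = nn.Linear(lstm_hidden, 2)  # intervene / do not intervene

    def forward(self, x):                         # x: (batch, seq_len, embed_dim)
        h = self.conv(x.transpose(1, 2)).relu()   # (batch, channels, seq_len)
        h, _ = self.lstm(h.transpose(1, 2))       # (batch, seq_len, lstm_hidden)
        weights = torch.softmax(self.attn(h), dim=1)  # (batch, seq_len, 1)
        pooled = (weights * h).sum(dim=1)         # attention-weighted summary vector
        return self.out(pooled)                   # class logits

# Usage with dummy BERT-sized embeddings (batch of 4 posts, 64 tokens each):
model = CnnLstmAttentionClassifier()
logits = model(torch.randn(4, 64, 768))
print(logits.shape)  # torch.Size([4, 2])
```

A word-character variant would add a parallel character-level encoder and concatenate its output with the pooled vector before the final layer; per the abstract, that addition helps the word2vec models but not the BERT-based one.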
