College English teaching serves students with different foundations and characteristics, so instructors need to track students' mental states in a timely way during teaching. Using bimodal information from the facial expressions and speech of students in the classroom, this paper designs a deep learning-based method for assessing mental state. For bimodal emotion recognition, a feature fusion method based on sparse canonical correlation analysis (SCCA) is proposed. First, emotional features are extracted separately from facial expressions and speech. Then, SCCA fuses the emotional features of the two modalities. Finally, sparse representation-based classification (SRC) serves as the classifier for emotion prediction. The prediction results indicate the mental state of individual students, so that the teaching strategy can be adjusted in a targeted manner. Experiments are carried out on public datasets. First, the proposed method achieves an average classification accuracy of 92.4%, higher than that of the comparison methods. Second, under noise corruption, the proposed method remains more robust than the comparison methods. These results show that the proposed bimodal emotion recognition method based on SCCA and SRC achieves higher recognition rates than some existing methods.
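The three-step pipeline described above (feature extraction, SCCA fusion, SRC classification) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. It assumes the facial and speech features have already been extracted into matrices `X` and `Y` (one row per sample). For SCCA it substitutes a simple soft-thresholded power iteration on the cross-covariance, which treats each modality's own covariance as identity. For SRC it solves the l1-regularized coding problem with plain ISTA. All function names and parameters (`sparse_cca`, `fuse`, `src_predict`, `frac`, `lam`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(a, frac):
    """Shrink entries toward zero by frac * max|a| (0 <= frac < 1)."""
    t = frac * np.max(np.abs(a))
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def unit(a):
    """Scale a vector to unit l2 norm (a zero vector stays zero)."""
    return a / (np.linalg.norm(a) + 1e-12)

def sparse_cca(X, Y, n_components=2, frac=0.3, n_iter=50):
    """Sparse canonical vectors via alternating soft-thresholded power
    iterations on the cross-covariance (assumed simplification: each
    modality's own covariance is treated as identity)."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    C = Xc.T @ Yc / X.shape[0]
    U, V = [], []
    for _ in range(n_components):
        # Start from the leading right singular vector of C.
        v = np.linalg.svd(C, full_matrices=False)[2][0]
        for _ in range(n_iter):
            u = unit(soft_threshold(C @ v, frac))
            v = unit(soft_threshold(C.T @ u, frac))
        U.append(u)
        V.append(v)
        # Deflate so the next pair captures remaining cross-covariance.
        C = C - (u @ C @ v) * np.outer(u, v)
    return np.column_stack(U), np.column_stack(V)

def fuse(X, Y, U, V):
    """Fused feature: concatenation of the two projected modalities."""
    return np.hstack([X @ U, Y @ V])

def src_predict(D, labels, y, lam=0.05, n_iter=500):
    """Sparse representation-based classification.
    D: (n_features, n_train) training samples as columns; y: test sample."""
    labels = np.asarray(labels)
    D = D / (np.linalg.norm(D, axis=0, keepdims=True) + 1e-12)
    y = unit(y)
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):  # ISTA for 0.5*||y - D a||^2 + lam*||a||_1
        g = a + step * (D.T @ (y - D @ a))
        a = np.sign(g) * np.maximum(np.abs(g) - lam * step, 0.0)
    classes = np.unique(labels)
    # Assign the class whose training samples reconstruct y best.
    residuals = [np.linalg.norm(y - D[:, labels == c] @ a[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]
```

In use, `sparse_cca` would be fitted on training features from both modalities, `fuse` applied to training and test samples alike, and `src_predict` called with the transposed fused training matrix as the dictionary. The thresholding fraction `frac` and the sparsity weight `lam` are illustrative defaults, not values reported in the paper.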