Abstract
In this study, we enhanced the CaffeNet network to recognize students' facial expressions in a music classroom and extracted emotional features from those expressions. In parallel, students' speech signals were passed through filters to extract emotional characteristics. The expression-based and speech-based emotional features were then combined with the LRLR fusion strategy to produce multimodal emotion recognition results, and a music teaching model incorporating this multimodal emotion recognition was developed. Our analysis shows only a 6.03% discrepancy between the model's emotion recognition results and manual emotional assessments, underscoring its effectiveness. Applying the model in a music teaching context led to a noticeable increase in positive emotional responses: happy and surprised emotions peaked at 30.04% and 27.36%, respectively, during the fourth week. Furthermore, 70% of students displayed a positive learning status, indicating a significant boost in engagement and motivation for music learning. This approach markedly enhances students' interest in learning and provides a solid basis for improving educational outcomes in music classes.
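The abstract describes a late-fusion pipeline in which the enhanced CaffeNet yields expression-based emotion features and filtered speech yields speech-based emotion features, combined by the LRLR strategy. The LRLR details are not given here, so the following Python sketch only illustrates a generic weighted late fusion of two per-modality emotion probability vectors; the function name, weights, and label set are hypothetical assumptions, not the paper's implementation.

```python
import numpy as np

# Assumed emotion label set for illustration only.
EMOTIONS = ["happy", "surprised", "neutral", "sad", "angry"]

def fuse_emotions(face_probs: np.ndarray,
                  speech_probs: np.ndarray,
                  w_face: float = 0.6,
                  w_speech: float = 0.4) -> str:
    """Weighted late fusion of per-modality emotion probability vectors.

    face_probs / speech_probs: probability distributions over EMOTIONS,
    e.g. softmax outputs of the facial and speech emotion recognizers.
    """
    fused = w_face * face_probs + w_speech * speech_probs
    fused /= fused.sum()                      # renormalize to a distribution
    return EMOTIONS[int(np.argmax(fused))]    # most likely fused emotion

# Example: the facial model leans "happy", the speech model leans "surprised".
face_probs = np.array([0.55, 0.25, 0.10, 0.05, 0.05])
speech_probs = np.array([0.30, 0.45, 0.15, 0.05, 0.05])
print(fuse_emotions(face_probs, speech_probs))  # -> "happy"
```

In practice the fusion weights would be learned from labeled data rather than fixed; fixed weights are used here only to keep the sketch self-contained.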