Abstract

In this paper, starting from the fusion of audio-visual multisensory vocal music feature representations, the RAE algorithm is used to represent lyric text in vocal singing teaching. Because the feature spaces of the audio modality and the lyric modality are heterogeneous, directly mining the correlation between the two modalities is exceptionally difficult; the audio and textual modalities are therefore fused to learn an implicit-space representation of the music. Since the SVM's optimal classification hyperplane is not limited by the data dimension, an SVM-based emotion model for audio-visual multisensory vocal singing teaching is constructed by combining the kernel function parameters with emotional LFSM fusion. The parameter environment and dataset are selected, and the comparison methods and evaluation criteria are determined, in order to analyze emotion in vocal singing teaching in colleges and universities. The results show that, in terms of model performance, the SVM model in this paper outperforms model 6 by 8.5%, reaching a peak score of 0.873; its stronger emotion extraction and recognition ability greatly improves the model's emotion classification results. Among the multimedia types, audio singing material is the least helpful for expression and physical performance in vocal singing teaching, with a total score of 67. This study provides a more comprehensive understanding of students' emotional changes during teaching activities, making it possible to identify problems, improve and optimize instruction, and give more students guidance and support in vocal performance.
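
As a rough illustration of the pipeline summarized above, the following is a minimal sketch (not the authors' implementation) of early fusion of audio and lyric-text feature vectors followed by an RBF-kernel SVM emotion classifier, using scikit-learn. The feature extractors are stubbed with random data, and all dimensions, class counts, and variable names are hypothetical stand-ins for the paper's RAE lyric embeddings and audio descriptors.

```python
# Minimal sketch: fuse audio and lyric-text features, then classify
# emotion with an RBF-kernel SVM. Features are random stand-ins; in the
# paper the text side would come from an RAE and fusion would use the
# learned implicit space rather than plain concatenation.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_samples, d_audio, d_text = 400, 40, 64          # hypothetical dimensions

audio_feats = rng.normal(size=(n_samples, d_audio))  # stand-in audio features
text_feats = rng.normal(size=(n_samples, d_text))    # stand-in RAE lyric embeddings
labels = rng.integers(0, 4, size=n_samples)          # e.g., four emotion classes

# Early fusion: concatenate the two modality vectors into one representation.
fused = np.hstack([audio_feats, text_feats])

X_train, X_test, y_train, y_test = train_test_split(
    fused, labels, test_size=0.25, random_state=0)

scaler = StandardScaler().fit(X_train)

# RBF kernel: the optimal separating hyperplane lives in a kernel-induced
# space, so classification quality is not tied to the raw input dimension.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(scaler.transform(X_train), y_train)

print("macro-F1:", f1_score(y_test,
                            clf.predict(scaler.transform(X_test)),
                            average="macro"))
```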
