Emotion is important information that people convey during communication, and changes in emotional state affect perception and decision-making, which motivates introducing an emotional dimension into human-computer interaction. Emotion is expressed through facial expressions, speech, posture, physiological signals, text, and other channels, so emotion recognition is essentially a multimodal fusion problem. This paper investigates the teaching modes used by teachers and students at our school, designs the load capacity using the K-means algorithm, and builds a multimedia network sharing classroom. The classroom creates piano music situations to stimulate students’ interest in learning, uses audiovisual and other tools to engage students’ emotions, and uses multimedia guidance to extend students’ knowledge of piano music, with the aim of comprehensively improving their aesthetic ability and autonomous learning ability. A comparison of the changes in students after 3 months of teaching shows that the multimedia sharing classroom can be up to 50% ahead of traditional teaching methods in enhancing students’ interest, and that teachers’ acceptance of the multimedia network sharing classroom is also high.