Abstract

The rise of artificial intelligence has driven progress in human-computer interaction and related fields. In human-computer interaction, user emotion recognition has been widely studied so that machines can accurately perceive and understand a user's emotional state in real time and thereby improve the quality of their service. In practice, human-computer interaction is carried out mainly through speech, because voice output is not only convenient but also rich in emotional information. Speech carries a wealth of linguistic, paralinguistic, and nonlinguistic information that is essential for human-computer interaction. Understanding the linguistic content alone does not allow a computer to fully grasp the speaker's intent; for computers to behave more like humans, speech recognition systems must also process nonverbal information, such as the emotional state of the speaker. Speech-based emotion recognition is therefore a prerequisite for machine understanding of human emotion. This paper proposes an improved long short-term memory network (ILSTM) for emotion recognition. Because a standard LSTM considers only the input up to the current moment, it misses much of the information in the full context of an utterance; the ILSTM addresses this limitation so that all the features in a speech segment can be extracted. To select, from among the many extracted features, those that are most expressive of emotion, this paper also introduces an attention mechanism. Experiments on public datasets show that the proposed ILSTM is effective at classifying speech emotion data, reaching a classification accuracy above 0.6. These results suggest that the approach is feasible and offers reference value for practical applications.
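The combination described in the abstract, a recurrent network over speech frames followed by attention that weights the most emotion-relevant frames, can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch implementation, not the authors' exact ILSTM: it assumes a bidirectional LSTM as one common way to give each frame access to the full context of the utterance, and assumes illustrative values for the feature dimension (40, e.g., MFCCs), hidden size, and number of emotion classes (4).

```python
import torch
import torch.nn as nn

class AttentiveBiLSTM(nn.Module):
    """Hypothetical sketch of an LSTM-plus-attention emotion classifier.

    Illustrates the idea in the abstract; not the paper's exact ILSTM.
    """
    def __init__(self, n_features=40, hidden=128, n_classes=4):
        super().__init__()
        # Bidirectional so each frame sees both past and future context,
        # unlike a unidirectional LSTM that only uses preceding frames.
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True,
                            bidirectional=True)
        # Attention scores one weight per frame; a softmax over time
        # emphasizes the frames most indicative of emotion.
        self.attn = nn.Linear(2 * hidden, 1)
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                    # x: (batch, frames, n_features)
        h, _ = self.lstm(x)                  # h: (batch, frames, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)  # (batch, frames, 1)
        context = (weights * h).sum(dim=1)   # attention-weighted summary
        return self.classifier(context)      # (batch, n_classes) logits

# Example: 8 utterances, 200 frames of 40-dimensional features.
model = AttentiveBiLSTM()
logits = model(torch.randn(8, 200, 40))
print(logits.shape)  # torch.Size([8, 4])
```

The attention layer here is the simplest additive form: a learned linear score per time step, normalized with a softmax so the classifier attends to a weighted summary of the whole utterance rather than only its final hidden state.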
