Speech Emotion Recognition

Yash Dixit, Surabhi, Suraj Yadav, Shivam Singh, Sakshi Chauhan

doi:10.36948/ijfmr.2024.v06i03.21105

Abstract

Speech emotion recognition (SER) is a vital area of research with applications ranging from human-computer interaction to mental health monitoring. This paper provides a comprehensive survey of the techniques, methods, applications, and challenges in SER. It begins by examining the significance of recognizing emotions from speech and its diverse applications across various fields. The paper then reviews the methods employed for SER, encompassing traditional machine learning techniques, such as support vector machines and Gaussian mixture models, as well as contemporary approaches, including deep learning and multimodal fusion. It also examines the benchmark datasets commonly used for training and evaluation in SER research. SER has a wide range of applications and has attracted a great deal of research in recent years; however, the entertainment sector remains understudied. Many studies use Neural Network (NN) and Long Short-Term Memory (LSTM) architectures to categorize the emotions in audio recordings of actors expressing various emotions. Our survey further explores real-world applications of SER, such as virtual assistants, healthcare, marketing, education, and mental health diagnosis. The paper discusses the challenges associated with SER, including the variability of emotional expression, cultural influences, humor, and privacy concerns. This survey is intended to help future work deal with noisy datasets and cultural effects on emotional expression.

Full Text