Abstract

When we speak with someone face to face, the first thing we notice is their face, and from it we can readily infer their emotional state. In indirect communication, such as a phone call, that visual cue is missing: a human can still estimate a speaker's feelings from voice alone to some extent, but a computer given only an audio clip cannot. Speech Emotion Recognition (SER) was devised to address this problem. Several strategies exist for SER, and the deciding factor between them is recognition accuracy, so this work compares the accuracies of two techniques. The first, from deep learning, is a convolutional neural network (CNN), which uses stacked feature-extraction modules followed by a classifier to discriminate emotions such as happiness, surprise, anger, neutrality, and sadness. The second, from classical machine learning, is a support vector machine (SVM) trained on the RAVDESS speech database.
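To make the SVM classification step concrete, the sketch below trains scikit-learn's `SVC` to assign emotion labels to per-clip feature vectors. This is a minimal illustration under stated assumptions, not the paper's implementation: real SER systems extract acoustic features (e.g. MFCCs) from RAVDESS audio, whereas here the feature vectors are synthetic, well-separated clusters standing in for those features.

```python
# Hedged sketch of an SVM emotion classifier for SER.
# Assumption: 13-dimensional feature vectors per audio clip (MFCC-like);
# the synthetic clusters below replace real RAVDESS features.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
EMOTIONS = ["happy", "surprise", "anger", "neutral", "sad"]

# One tight synthetic cluster of 20 "clips" per emotion.
X = np.vstack([rng.normal(loc=10 * i, scale=0.5, size=(20, 13))
               for i in range(len(EMOTIONS))])
y = np.repeat(EMOTIONS, 20)

clf = SVC(kernel="rbf")  # RBF-kernel SVM, a common choice for SER features
clf.fit(X, y)

# Classify a new "clip" whose features sit near the "anger" cluster.
probe = np.full((1, 13), 20.0)
print(clf.predict(probe)[0])
```

In practice, the feature extraction step (not shown) dominates SER accuracy; the classifier choice, CNN versus SVM, is then compared on the same extracted features, which is the comparison the abstract describes.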
