Abstract

It is often easy to identify emotions in human-to-human interaction, since they are conveyed through speech, body language, and facial expressions. Recognising human emotion in human-computer interaction (HCI), however, can be difficult. Speech emotion recognition (SER), which aims to identify emotions from vocal intonation alone, has emerged as a way to enhance this interaction. In this paper, a SER system based on deep learning methodologies is proposed. The RAVDESS dataset is used to evaluate the proposed system, and Mel-frequency cepstral coefficients (MFCCs) are used to select the vocal features that best represent speech emotions. The SER system is built with three deep learning models: an LSTM, a CNN, and a hybrid model that combines the two. By comparing these strategies, we identify the most effective model for recognising emotional states from speech data in real-time settings. Overall, this work demonstrates the usefulness of the proposed deep learning approach.
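
To illustrate the pipeline the abstract describes, the sketch below shows how MFCC features might be extracted and how a hybrid CNN-LSTM classifier could be assembled. This is a minimal sketch, assuming librosa for feature extraction and TensorFlow/Keras for modelling; the paper does not name its libraries, and all hyperparameters (e.g. n_mfcc=40, layer sizes) and the extract_mfcc/build_hybrid_model helpers are illustrative, not the authors' actual architecture.

```python
# Minimal SER sketch: MFCC features + hybrid CNN-LSTM classifier.
# Assumptions (not from the paper): librosa for audio, TensorFlow/Keras
# for the model; every hyperparameter below is illustrative.
import librosa
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def extract_mfcc(path, n_mfcc=40, max_frames=300):
    """Load one clip and return a fixed-size (max_frames, n_mfcc) MFCC matrix."""
    signal, sr = librosa.load(path, sr=None)  # keep the native sampling rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).T  # (frames, n_mfcc)
    # Pad or truncate along the time axis so every clip has the same length.
    if mfcc.shape[0] < max_frames:
        mfcc = np.pad(mfcc, ((0, max_frames - mfcc.shape[0]), (0, 0)))
    return mfcc[:max_frames]

def build_hybrid_model(n_frames=300, n_mfcc=40, n_classes=8):
    """Hybrid model: 1D convolutions learn local spectral patterns and an
    LSTM models their temporal dynamics. RAVDESS has 8 emotion classes."""
    model = models.Sequential([
        layers.Input(shape=(n_frames, n_mfcc)),
        layers.Conv1D(64, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(128, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.LSTM(64),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Under these assumptions, the pure-CNN and pure-LSTM baselines follow the same pattern with the recurrent or convolutional blocks removed, which keeps the three-way comparison on a common MFCC input representation.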
