Abstract
It is critical for a computer to understand the speaker's mood during a human–machine conversation, yet robots have so far been trained largely on neutral phrases or utterances. A person's mood affects their performance. Machines have difficulty deciphering human mood from voice because humans can produce fourteen distinct sounds in a second. For a machine to comprehend human behavior, it must first model the acoustic capabilities of the human ear; Linear Prediction Coefficients (LPC) and Mel Frequency Cepstral Coefficients (MFCC) can simulate the human auditory system. Emotion Recognition from Indian Languages (ERIL) recognizes emotions such as fear, anger, surprise, sadness, happiness, and neutral. ERIL first pre-processes the voice signal, extracts selected MFCC, LPC, pitch, and voice-quality features, and then classifies the speech using CatBoost, which we chose after testing ERIL on several benchmark classifiers. ERIL is a multilingual emotion classifier and is independent of any particular language: we evaluated it on Hindi, Gujarati, Marathi, Punjabi, Bangla, Tamil, Oriya, Kannada, Assamese, and Telugu, recording a speech dataset of various emotions in these languages. Averaged over the distinct emotions, accuracy is 95.05 percent; averaged over the languages, it is 95.05082 percent.
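The abstract names LPC among the features ERIL extracts from each speech frame. As an illustration only, the following is a minimal NumPy sketch of the standard autocorrelation method with the Levinson-Durbin recursion; the paper does not specify its exact LPC implementation, so this is a generic textbook formulation, not ERIL's code.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Estimate LPC coefficients for one speech frame using the
    autocorrelation method and the Levinson-Durbin recursion.
    NOTE: a generic textbook sketch, not necessarily the exact
    variant used by ERIL."""
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient from the current prediction error
        acc = np.dot(a[:i], r[i:0:-1])
        k = -acc / err
        new_a = a.copy()
        new_a[1:i] = a[1:i] + k * a[i - 1:0:-1]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    # a[0] == 1; the predictor is x[n] ~ -sum(a[1:] * x[n-1 .. n-order])
    return a
```

For a decaying-exponential frame x[n] = 0.9^n, the order-1 coefficient comes out near -0.9, i.e. the predictor recovers x[n] ~ 0.9·x[n-1], which matches the intuition behind linear prediction of speech.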