Abstract

It is critical for a computer to understand a speaker's mood during a human–machine conversation, yet until now robots have largely been trained on neutral phrases and utterances. A person's mood affects their performance. Machines have difficulty deciphering human mood from voice because humans can produce fourteen distinct sounds per second. For a machine to comprehend human behavior, it must first model the acoustic capabilities of the human ear. Linear Prediction Coefficients (LPC) and Mel Frequency Cepstral Coefficients (MFCC) can simulate the human auditory system. Emotion Recognition from Indian Languages (ERIL) recognizes emotions such as fear, anger, surprise, sadness, happiness, and neutral. ERIL first pre-processes the voice signal, extracts selected MFCC, LPC, pitch, and voice-quality features, and then classifies the speech using CatBoost. We tested ERIL against several benchmark classifiers before choosing CatBoost. ERIL is a multilingual emotion classifier and is independent of any particular language; we evaluated it on Hindi, Gujarati, Marathi, Punjabi, Bangla, Tamil, Oriya, Kannada, Assamese, and Telugu, recording a speech dataset of the various emotions in these languages. The average accuracy across the distinct emotions is 95.05 percent, and the combined average over all languages is 95.05082 percent.
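The pipeline described above (pre-process the signal, extract acoustic features, then classify) can be illustrated with a minimal, stdlib-only sketch of the feature-extraction step. This is not the paper's implementation: MFCC, LPC, and the CatBoost classifier are not reproduced here; instead, three simpler acoustic features the abstract alludes to (pitch via autocorrelation, zero-crossing rate, and short-time energy) stand in, computed on a synthetic tone. All names and parameter values are illustrative assumptions.

```python
import math

SR = 16000  # assumed sample rate in Hz

def sine(freq, dur=0.5, sr=SR):
    """Synthesize a test tone (stand-in for a recorded utterance)."""
    n = int(dur * sr)
    return [math.sin(2 * math.pi * freq * i / sr) for i in range(n)]

def short_time_energy(x):
    """Mean squared amplitude: a crude loudness/voice-quality cue."""
    return sum(s * s for s in x) / len(x)

def zero_crossing_rate(x):
    """Fraction of adjacent-sample pairs that change sign."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)

def pitch_autocorr(x, sr=SR, fmin=50, fmax=500):
    """Estimate pitch as the autocorrelation peak in a plausible lag range."""
    lo, hi = int(sr / fmax), int(sr / fmin)
    best_lag = max(
        range(lo, hi + 1),
        key=lambda lag: sum(x[i] * x[i + lag] for i in range(len(x) - lag)),
    )
    return sr / best_lag

tone = sine(220.0)  # a 220 Hz tone; a real system would load speech audio
features = (pitch_autocorr(tone), zero_crossing_rate(tone), short_time_energy(tone))
```

In the actual system, the feature vector would concatenate MFCC, LPC, pitch, and voice-quality measures and be fed to a trained CatBoost model; the sketch only shows the shape of the "signal in, feature vector out" step.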
