BERIS: An mBERT-based Emotion Recognition Algorithm from Indian Speech

Pramod Mehra,Shashi Kant Verma

doi:10.1145/3517195

Abstract

Emotions, the building blocks of the human intellect, play a vital role in Artificial Intelligence (AI). For a robust AI-based machine, it is important that the machine understands human emotions. COVID-19 has introduced the world to no-touch intelligent systems. With an influx of users, it is critical to create devices that can communicate in a local dialect. A multilingual system is required in countries like India, which has a large population and a diverse range of languages. Given the importance of multilingual emotion recognition, this research introduces BERIS, an Indian language emotion detection system. From the Indian sound recording, BERIS estimates both acoustic and textual characteristics. To extract the textual features, we used Multilingual Bidirectional Encoder Representations from Transformers. For acoustics, BERIS computes the Mel Frequency Cepstral Coefficients and Linear Prediction coefficients, and Pitch. The features extracted are merged in a linear array. Since the dialogues are of varied lengths, the data are normalized to have arrays of equal length. Finally, we split the data into training and validated set to construct a predictive model. The model can predict emotions from the new input. On all the datasets presented, quantitative and qualitative evaluations show that the proposed algorithm outperforms state-of-the-art approaches.

Full Text