Abstract

Speech Emotion Recognition (SER) remains an active topic in affective computing, drawing considerable research interest driven by its growing potential, algorithmic advances, and real-world applications. Human speech carries paralinguistic cues that can be quantified through features such as pitch, intensity, and Mel-Frequency Cepstral Coefficients (MFCCs). SER typically involves three main stages: data processing, feature extraction and selection, and classification based on the extracted emotional features. Because these steps must be tailored to the distinctive attributes of human speech, machine learning methods are a natural choice for SER implementation. While recent studies in affective computing apply a variety of ML techniques to SER tasks, few examine the techniques supporting these core steps, and the challenges within each stage, along with the state-of-the-art approaches addressing them, often receive limited discussion or are overlooked entirely.

This project introduces a Speech Emotion Recognition system based on a BiLSTM algorithm and image processing techniques, implemented in Python on the Raspberry Pi platform. The system analyses recorded audio input, uses three LEDs to indicate the detected mood, and is programmed via an SD card. This abstract outlines an end-to-end approach to real-time emotion recognition through audio analysis, suited to varied applications in emotional AI and human-computer interaction.

KEYWORDS: Speech Emotion Recognition (SER), Machine Learning, Image Processing, Raspberry Pi.
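To make the feature-extraction stage concrete, the sketch below computes two of the paralinguistic features the abstract names, intensity and pitch, for a single frame of audio. It is a minimal illustration only: the signal is a synthetic 220 Hz tone standing in for a recorded utterance, the frame length and the autocorrelation-based pitch estimator are simplifying assumptions, and a real SER pipeline would add MFCC extraction and frame-by-frame analysis.

```python
import numpy as np

def extract_features(signal, sr, frame_len=2048):
    """Toy frame-level features: intensity (RMS energy) and pitch.

    Pitch is estimated from the strongest autocorrelation peak,
    a deliberately simplified stand-in for a production estimator.
    """
    frame = signal[:frame_len]
    # Intensity: root-mean-square energy of the frame
    rms = np.sqrt(np.mean(frame ** 2))
    # Autocorrelation; index frame_len - 1 corresponds to lag 0
    ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
    # Skip very small lags (would correspond to pitches above 500 Hz)
    min_lag = sr // 500
    peak_lag = min_lag + int(np.argmax(ac[min_lag:]))
    pitch = sr / peak_lag
    return rms, pitch

# Synthetic 1-second, 220 Hz tone at a 16 kHz sampling rate
sr = 16000
t = np.arange(sr) / sr
sig = 0.5 * np.sin(2 * np.pi * 220 * t)

rms, pitch = extract_features(sig, sr)
```

For the pure tone above, the estimated pitch lands within a few hertz of 220 Hz and the RMS close to 0.5/√2, which is the expected energy of a sine of amplitude 0.5; on real speech these frame-level values would feed the classification stage.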

