Abstract

The predominant communication channel for conveying relevant, high-impact information is the emotion embedded in our communication. Researchers have tried to exploit these emotions in recent years for human-robot interaction (HRI) and human-computer interaction (HCI). Emotion recognition through speech alone or through facial expression alone is termed single-mode emotion recognition. The accuracy of these single-mode systems is improved by the proposed bimodal method, which combines the speech and face modalities and recognizes emotions using a Convolutional Neural Network (CNN) model. In this paper, the proposed bimodal emotion recognition system contains three major parts: audio processing, video processing, and data fusion for detecting a person's emotion. Fusing the visual information and audio data obtained from two different channels enhances the emotion recognition rate by providing complementary data. The proposed method aims to classify 7 basic emotions (anger, disgust, fear, happy, neutral, sad, surprise) from an input video. We take the audio and image frames from the video input to predict the final emotion of a person. The dataset used is RAVDESS, an audio-visual dataset uniquely suited to the study of multi-modal emotion expression and perception; it contains audio-visual, visual-only, and audio-only recordings. For bimodal emotion detection, the audio-visual portion of the dataset is used.

Highlights

  • We present an evaluation and comparison of the other experimented models, such as Random Forest, Decision Tree, and Convolutional Neural Network (CNN) for speech, and VGG and CNN for face. (Manuscript received on April 12, 2021.)

  • The kernel size used in this experimentation is 3x3 and the Rectified Linear Unit (ReLU) is used as the activation function. – Batch Normalization: normalizes the inputs given to the layer to a scale of 0 to 1, so that the values are not scattered over a wide range. – MaxPooling2D: in the model built, this function uses a pooling window of size 2x2 with 2x2 strides to perform the pooling operation on the data. – Softmax: this function normalizes K real numbers taken from the input vector into a probability distribution consisting of K probabilities.
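The softmax step described in the highlight above can be sketched in NumPy. This is a minimal illustration of the mathematical operation, not the paper's implementation (which applies softmax as the final CNN layer):

```python
import numpy as np

def softmax(logits):
    """Normalize K real numbers into a probability distribution of K probabilities."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

# Illustrative class scores for 3 of the 7 emotions
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
# probs sums to 1 and preserves the ordering of the input scores
```

In the model described here, the K probabilities would correspond to the 7 emotion classes, and the predicted emotion is the class with the highest probability.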

  • This paper presented a bimodal emotion recognition system that uses information from the audio and visual channels obtained from a video stream.


Summary

INTRODUCTION

Emotions are a universal, language-independent means of communication that are expressed non-verbally. The speech emotion recognition system is based on a CNN model [15]; it recognizes emotion using feature extraction. A. Koduru et al. [12] proposed a speech emotion recognition model that extracts features, selects the required region of interest, and classifies the emotion. The main focus of that work was to use different feature extraction algorithms to improve the speech emotion recognition rate. Zhou et al. [13] used both spectral and prosodic features for recognizing emotion from speech input. Both spectral and prosodic features contain emotion information, and combining them improves the performance of the emotion recognition system. Their method uses short-time log frequency power coefficients (LFPC) to represent the speech signals and a discrete hidden Markov model (HMM) for classification of emotions. Results suggest that the average accuracy of emotion classification is 78%.
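As a rough illustration of the short-time log frequency power coefficients mentioned above, the NumPy sketch below frames a signal, computes its short-time power spectrum, and takes the log power in frequency bands. The frame length, hop size, band count, and linear band split are illustrative assumptions, not the exact LFPC formulation used in [13]:

```python
import numpy as np

def lfpc_like(signal, frame_len=400, hop=160, n_bands=12):
    """Short-time log power in frequency bands (simplified LFPC-style features)."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * np.hamming(frame_len)
        power = np.abs(np.fft.rfft(frame)) ** 2       # per-frame power spectrum
        bands = np.array_split(power, n_bands)        # linear band split (illustrative)
        feats.append(np.log(np.array([b.sum() for b in bands]) + 1e-10))
    return np.array(feats)                            # shape: (n_frames, n_bands)

rng = np.random.default_rng(0)
features = lfpc_like(rng.standard_normal(16000))      # 1 s of noise at 16 kHz
```

The resulting per-frame feature vectors are the kind of observation sequence that a discrete HMM classifier would be trained on, one model per emotion class.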

DATASET
Image and Audio extraction from Video
Facial Emotion Recognition
Speech Emotion Recognition
Bimodal Integration using Fusion Rule
RESULTS AND DISCUSSIONS
CONCLUSION

