Emotional speaker identification using a novel capsule nets model

Ali Bou Nassif, Ismail Shahin, Ashraf Elnagar, Divya Velayudhan, Adi Alhudhaif, Kemal Polat

doi:10.1016/j.eswa.2021.116469

Abstract

Speaker recognition systems are widely used to identify a person by their voice; however, the high variability of speech signals makes this a challenging task. Emotional variation is particularly difficult to handle because emotions alter a speaker's voice characteristics, so the acoustic features of emotional speech differ from those used to train models on neutral speech. As a result, speaker recognition models trained on neutral speech fail to correctly identify speakers under emotional stress. Although convolutional neural networks (CNNs) have brought considerable advances in speaker identification, they cannot exploit the spatial association between low-level features. Capsule networks (CapsNets), a deep learning architecture, were recently introduced to overcome this inadequacy: the pooling technique used in CNNs fails to preserve the pose relationship between low-level features. Inspired by CapsNets, this study investigates their performance in identifying speakers from emotional speech recordings. A CapsNet-based speaker identification model is proposed and evaluated on three distinct speech databases: the Emirati Speech Database, the SUSAS dataset, and RAVDESS (open-access). The proposed model is also compared to baseline systems. Experimental results demonstrate that the proposed CapsNet model trains faster and provides better results than current state-of-the-art schemes. The effect of the routing algorithm on speaker identification performance was also studied by varying the number of iterations, both with and without a decoder network.
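
For readers unfamiliar with the routing step mentioned in the abstract, the following is a minimal sketch of the generic routing-by-agreement procedure commonly used in capsule networks. It is not the authors' model: the function names (squash, softmax, dynamic_routing), the toy input, and the default of three iterations are assumptions made for illustration only. It shows how the number of routing iterations enters the computation as a plain loop count.

```python
import math


def squash(s):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0 for _ in s]
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iterations=3):
    """Generic routing-by-agreement (illustrative, not the paper's implementation).

    u_hat[i][j] is the prediction vector that input capsule i makes for
    output capsule j. Returns the output capsule vectors and the coupling
    coefficients used in the final iteration.
    """
    num_in, num_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    # Routing logits start at zero, so the first pass couples uniformly.
    b = [[0.0] * num_out for _ in range(num_in)]
    for _ in range(num_iterations):
        # Each input capsule distributes its vote over the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output vector.
        for i in range(num_in):
            for j in range(num_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Toy input (assumed): 3 input capsules, 2 output capsules, 2-D vectors.
toy_u_hat = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, -1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
outputs, couplings = dynamic_routing(toy_u_hat, num_iterations=3)
```

In this toy input all three input capsules agree on output capsule 0 and disagree on output capsule 1, so additional iterations shift coupling toward capsule 0. Varying num_iterations is the knob the abstract refers to; the decoder network it mentions is not modeled here.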

Journal: Expert Systems with Applications
Publication Date: Jan 5, 2022
Citations: 25

Similar Papers

A Novel RBFNN-CNN Model for Speaker Identification in Stressful Talking Environments
Ali Bou Nassif ... Noha Alnazzawi
Applied Sciences | VOL. 12
11 May 2022

Emotional Speaker Recognition based on Machine and Deep Learning
Tshephisho Joseph Sefara ... Tumisho Billson Mokgonyane
25 Nov 2020

Emotional Speaker Verification Using Novel Modified Capsule Neural Network
Ali Bou Nassif ... Ismail Shahin
Mathematics | VOL. 11
15 Jan 2023

Speaker Recognition with VAD
Jian Ling ... Jianwei Zhu
01 Jun 2009
