Abstract

Emotional conditions cause changes in the speech production system, producing acoustic characteristics that differ from those of neutral speech. The presence of emotion degrades the performance of a speaker verification system. In this paper, we propose a speaker modeling approach that accommodates the presence of emotions in speech segments by extracting a compact speaker representation. The speaker model is estimated by following a procedure similar to the i-vector technique, but it treats the emotional effect as the channel variability component. We name this method emotional variability analysis (EVA). Like the joint factor analysis (JFA) model, EVA represents the emotion subspace separately from the speaker subspace. The effectiveness of the proposed system is evaluated by comparing it with the standard i-vector system on the speaker verification task of the Speech Under Simulated and Actual Stress (SUSAS) dataset with three different scoring methods, with performance measured in terms of the equal error rate (EER). In addition, we conducted an ablation study for a more comprehensive analysis of the EVA-based i-vector. Based on the experimental results, the proposed system outperformed the standard i-vector system and achieved state-of-the-art results in the verification task for speakers under stress.
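The equal error rate (EER) used for evaluation above is the operating point at which the false-acceptance and false-rejection rates coincide. As a hedged illustration (not the paper's own implementation), the sketch below estimates the EER from two arrays of verification scores by sweeping a decision threshold; the function name and inputs are hypothetical:

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER by sweeping a threshold over all observed scores
    and returning the point where FAR and FRR are closest."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best_gap, eer, best_thr = np.inf, None, None
    for thr in thresholds:
        far = np.mean(impostor_scores >= thr)  # false acceptance rate
        frr = np.mean(genuine_scores < thr)    # false rejection rate
        if abs(far - frr) < best_gap:
            best_gap = abs(far - frr)
            eer, best_thr = (far + frr) / 2, thr
    return eer, best_thr
```

For perfectly separable genuine and impostor score distributions this yields an EER of 0; in practice the rates cross at some nonzero value, and lower is better.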

Highlights

  • Speaker verification is the process of accepting or rejecting the identity claim of a speaker [1]. This system is commonly used in applications that use the voice for identity confirmation, known as biometrics, in natural language technologies [2], or as a pre-processing stage of a speaker-dependent system, such as conversational-based algorithms [3,4]

  • The Mahalanobis distance scoring (MDS) seeks to account for the correlation between variables by assuming an anisotropic Gaussian distribution instead of an isotropic one

  • Since emotional variability analysis (EVA) compensates for emotions, there are some correlations between the emotion and the speaker's supervector


Summary

Introduction

Speaker verification is the process of accepting or rejecting the identity claim of a speaker [1]. This system is commonly used in applications that use the voice for identity confirmation, known as biometrics, in natural language technologies [2], or as a pre-processing stage of a speaker-dependent system, such as conversational-based algorithms [3,4]. Many methods have been explored for the verification task [5], but only a little work has examined the effects of emotional conditions on speech characteristics. Emotional conditions (especially stress) are among the factors that most strongly affect the characteristics of the voice

