Regularized Within-Class Precision Matrix Based PLDA in Text-Dependent Speaker Verification

Sung-Hyun Yoon, Ha-Jin Yu, Jong-June Jeon

doi:10.3390/app10186571


Open Access

https://doi.org/10.3390/app10186571


Journal: Applied Sciences
Publication Date: Sep 20, 2020
Citations: 3
License type: CC BY 4.0

Affiliation: University of Seoul

Abstract

In the field of speaker verification, probabilistic linear discriminant analysis (PLDA) is the dominant method for back-end scoring. To estimate the PLDA model, the between-class covariance and within-class precision matrices must be estimated from samples. However, the empirical covariance and precision matrices contain estimation errors because the number of available samples is limited. In this paper, we propose a method that improves the conventional PLDA by estimating the PLDA model from a regularized within-class precision matrix. We use the graphical least absolute shrinkage and selection operator (GLASSO) for the regularization. GLASSO reduces the estimation errors in the empirical precision matrix by making it sparse, and the resulting sparsity pattern reflects the conditional independence structure. Experimental results on text-dependent speaker verification show that the proposed method reduces the equal error rate by up to 23% relative to the conventional PLDA.
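For readers unfamiliar with GLASSO, the regularization named in the abstract is commonly written as an L1-penalized Gaussian log-likelihood. The form below is the standard textbook objective, not necessarily the exact formulation used in the paper; the symbols S_w (empirical within-class covariance), Θ (within-class precision matrix), and ρ (regularization strength) are notation assumed here for illustration, and some variants penalize only the off-diagonal entries.

```latex
\hat{\Theta} \;=\; \arg\max_{\Theta \succ 0}\;
  \log\det\Theta \;-\; \operatorname{tr}\!\left(S_w \Theta\right) \;-\; \rho\,\lVert \Theta \rVert_1 ,
\qquad
\lVert \Theta \rVert_1 = \sum_{i,j} \lvert \Theta_{ij} \rvert

% For a Gaussian vector x with precision matrix \Theta, a zero entry
% encodes conditional independence given all remaining coordinates:
\Theta_{ij} = 0 \;\Longleftrightarrow\; x_i \perp x_j \,\mid\, x_{\setminus\{i,j\}}
```

A larger ρ drives more entries of the estimate to exactly zero, which is the sparsity the abstract refers to.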

Highlights

Automatic speaker verification (ASV) is a technique to verify a user’s identity by comparing an utterance of a user with the reference utterance of a known target speaker.
The dashed red line depicts the equal error rates (EERs) of the original probabilistic linear discriminant analysis (PLDA), and the solid blue line depicts the EERs of the graphical least absolute shrinkage and selection operator (GLASSO)-PLDA.
We improved the conventional PLDA by proposing the GLASSO-PLDA, in which the GLASSO-regularized within-class precision matrix was used to estimate the PLDA model.
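Since the highlights compare systems by their EERs, a minimal sketch of how an EER and a relative EER reduction are typically computed may be useful. This is a generic illustration, not the paper's evaluation code; the function names `equal_error_rate` and `relative_reduction` are hypothetical, and the simple threshold sweep below does not interpolate between operating points.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: the error rate at the threshold where the false
    rejection rate (FRR) and false acceptance rate (FAR) are closest."""
    tgt = np.asarray(target_scores, dtype=float)
    non = np.asarray(nontarget_scores, dtype=float)
    # Candidate thresholds: every observed score (assumption: no interpolation).
    thresholds = np.sort(np.concatenate([tgt, non]))
    frr = np.array([(tgt < t).mean() for t in thresholds])   # targets rejected
    far = np.array([(non >= t).mean() for t in thresholds])  # non-targets accepted
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0

def relative_reduction(eer_baseline, eer_proposed):
    """Relative EER reduction of a proposed system over a baseline."""
    return (eer_baseline - eer_proposed) / eer_baseline
```

Under this definition, a statement such as "up to 23% relative" means that `relative_reduction` reaches 0.23 in the best condition, not that the EER dropped by 23 percentage points.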

Summary

Introduction

Automatic speaker verification (ASV) is a technique to verify a user’s identity by comparing an utterance of a user (test utterance) with the reference utterance of a known target speaker (enrollment utterance). In text-dependent speaker verification (TD-SV), it is relatively easy to control the phrase variability because the phrase is restricted. Due to these advantages, TD-SV has been widely used in real applications that require both high performance and short utterances, such as voice assistants [3,4]. The score is computed in a more discriminative subspace to compensate for the within-class variability of the embedding [14]. The empirical covariance/precision matrix has estimation errors because of the limited number of available samples (corresponding to the embeddings in our case). We propose a method to improve the performance of conventional PLDA by regularizing the within-class precision matrix used to estimate the PLDA model.
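The idea of regularizing the within-class precision matrix can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: it pools the within-class covariance over speakers, then solves an L1-penalized Gaussian likelihood with a generic ADMM solver. The function names, the penalty on all entries (including the diagonal), the values of `lam` and `rho`, and the synthetic embeddings are all assumptions made for the example.

```python
import numpy as np

def within_class_covariance(X, labels):
    """Pooled within-class covariance of embeddings X (n_samples, dim)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    centered = np.empty_like(X)
    for spk in np.unique(labels):
        mask = labels == spk
        centered[mask] = X[mask] - X[mask].mean(axis=0)  # remove speaker mean
    return centered.T @ centered / X.shape[0]

def soft_threshold(A, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def glasso_admm(S, lam, rho=1.0, n_iter=200):
    """Sparse precision estimate via ADMM (hypothetical helper).

    Minimizes tr(S X) - log det X + lam * ||Z||_1 subject to X = Z.
    """
    d = S.shape[0]
    Z = np.eye(d)
    U = np.zeros((d, d))
    for _ in range(n_iter):
        # X-update: closed form through an eigendecomposition.
        eigval, Q = np.linalg.eigh(rho * (Z - U) - S)
        x = (eigval + np.sqrt(eigval ** 2 + 4.0 * rho)) / (2.0 * rho)
        X = (Q * x) @ Q.T
        # Z-update: soft-thresholding produces exact zeros.
        Z = soft_threshold(X + U, lam / rho)
        # Scaled dual update.
        U += X - Z
    return Z

# Synthetic stand-in for speaker embeddings (assumption: 20 speakers x 10 utterances).
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(20), 10)
speaker_means = rng.normal(size=(20, 8))
X = speaker_means[labels] + rng.normal(size=(200, 8))

S_w = within_class_covariance(X, labels)
Theta = glasso_admm(S_w, lam=0.2)
print("zero entries in regularized precision:", int(np.sum(Theta == 0.0)))
```

The returned matrix would take the place of the inverse of the empirical within-class covariance when estimating the PLDA model; how that substitution enters the PLDA parameters is described in the paper itself and is not reproduced here.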

I-Vector

Deep Speaker Embeddings

Gaussian Markov Random Field

GLASSO

GLASSO Applied PLDA

Prerequisite

Database

Experimental Setup

Results

Evaluation in Text-Independent Speaker Verification

Comparison with Matrix Banding

Conclusions



Similar Papers

A fuzzy‐clustering‐based hierarchical i‐vector/probabilistic linear discriminant analysis system for text‐dependent speaker verification
Mohammad Azharuddin Laskar ... Rabul Hussain Laskar
Expert Systems | VOL. 37
30 Jan 2020

Investigating and improving the utility of probabilistic linear discriminant analysis for acoustic signal classification
Yuechi Jiang ... Frank H.F Leung
Digital Signal Processing | VOL. 114
15 Apr 2021

Generalized I-vector Representation with Phonetic Tokenizations and Tandem Features for both Text Independent and Text Dependent Speaker Verification
Ming Li ... Wenbo Liu
Journal of Signal Processing Systems | VOL. 82
02 Jul 2015

Large-scale speaker search using PLDA on mismatched conditions
Jeff Ma ... Owen Kimball
01 Apr 2015
