Speaker Verification Problem Research Articles

An active research topic in biometric authentication, the Text Independent Speaker Verification (TI-SV) problem aims to determine whether two given unconstrained utterances come from the same speaker. As state-of-the-art solutions, end-to-end approaches using deep neural networks seek to learn a highly discriminative speaker embedding space. In this paper, we propose a novel end-to-end approach for speaker embedding learning by focusing on two crucial factors: speaker embedder architecture and objective function. The proposed module in the speaker embedder is composed of an Efficient Multi-resolution feature Representation (EMR) block followed by a Multi-scale Channel Attention Fusion (MCAF) block. The EMR addresses the issue of the fixed-resolution convolutional kernels that are commonly used in most embedder architectures. Moreover, the MCAF improves on the simple summation-based feature fusion used in residual embedder networks. Regarding the objective function, we steer the speaker embedding space toward learning embedding-to-embedding relations, in addition to the embedding-to-training-class relations that are the only ones employed by most previous methods. To this end, we propose to employ a dynamic graph attention network on top of the proposed embedder to learn all informative relations between embeddings, and then train both the embedder and the graph-based network in an end-to-end manner. We conduct various experiments on the large-scale benchmark datasets VoxCeleb1&2. The effectiveness of all proposed components is verified through an ablation study. We show the superior or competitive performance of the proposed approach compared to seven well-known embedding architectures and 32 SV systems, regarding two evaluation metrics, EER and minDCF, as well as the number of embedder parameters.
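The abstract above reports results in terms of EER (equal error rate), the operating point at which the false acceptance rate equals the false rejection rate. As a rough illustration only, and not the evaluation code used by the authors, EER could be estimated from trial scores as sketched below. The function name `equal_error_rate` and the toy scores are hypothetical, and the sketch assumes that a higher score means "more likely the same speaker".

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a decision threshold over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    Returns (eer_estimate, threshold_at_eer).
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # False rejection rate: same-speaker trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: different-speaker trials at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Keep the threshold where the two error rates are closest.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (made up for illustration, not taken from any paper).
same_speaker = [0.9, 0.8, 0.7, 0.4]
different_speaker = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(same_speaker, different_speaker)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

With finitely many trials the two error rates rarely cross exactly, so this sketch reports the midpoint at the closest threshold; evaluation toolkits typically interpolate instead.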

This paper investigates the problem of speaker identification and verification in noisy conditions, assuming that speech signals are corrupted by environmental noise but that knowledge about the noise characteristics is not available. This research is motivated in part by the potential application of speaker recognition technologies on handheld devices or the Internet. While the technologies promise an additional biometric layer of security to protect the user, the practical implementation of such systems faces many challenges. One of these is environmental noise. Due to the mobile nature of such systems, the noise sources can be highly time-varying and potentially unknown. This raises the requirement for noise robustness in the absence of information about the noise. This paper describes a method that combines multicondition model training and missing-feature theory to model noise with unknown temporal-spectral characteristics. Multicondition training is conducted using simulated noisy data with limited noise variation, providing a "coarse" compensation for the noise, and missing-feature theory is applied to refine the compensation by ignoring noise variation outside the given training conditions, thereby reducing the mismatch between training and testing. This paper focuses on several issues relating to the implementation of the new model for real-world applications. These include the generation of multicondition training data to model noisy speech, the combination of different training data to optimize the recognition performance, and the reduction of the model's complexity. The new algorithm was tested using two databases with simulated and realistic noisy speech data. The first database is a redevelopment of the TIMIT database, created by rerecording the data in the presence of various noise types, and is used to test the model for speaker identification with a focus on the varieties of noise. The second database is a handheld-device database collected in realistic noisy conditions, used to further validate the model for real-world speaker verification. The new model is compared to baseline systems and is found to achieve lower error rates.
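The abstract above relies on missing-feature theory, which scores an observation using only the feature components judged reliable. As a minimal sketch of the classic marginalization idea, and not the specific model proposed in the paper, the example below evaluates a diagonal-covariance Gaussian while skipping unreliable components. The function name, the reliability mask, and the toy numbers are all hypothetical; how the reliable components are chosen is outside the scope of this sketch.

```python
import math


def masked_log_likelihood(x, mean, var, reliable):
    """Log-likelihood of x under a diagonal Gaussian, using reliable components only.

    For a diagonal covariance, marginalizing out a component is equivalent to
    leaving its term out of the sum, which is what the mask does here.
    """
    log_lik = 0.0
    for xi, mu, v, keep in zip(x, mean, var, reliable):
        if keep:
            log_lik += -0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
    return log_lik


# Toy example: the second component is assumed to be corrupted by noise.
frame = [0.0, 5.0]
model_mean = [0.0, 0.0]
model_var = [1.0, 1.0]

full = masked_log_likelihood(frame, model_mean, model_var, [True, True])
partial = masked_log_likelihood(frame, model_mean, model_var, [True, False])
print(full, partial)
```

Ignoring the corrupted component keeps a single noisy feature from dominating the score, which is the intuition behind reducing the training and testing mismatch.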

Articles published on Speaker Verification Problem

End-to-end deep speaker embedding learning using multi-scale attentional fusion and graph neural networks

A speaker verification backend with robust performance across conditions

Variational Domain Adversarial Learning With Mutual Information Maximization for Speaker Verification

A Speaker Verification Method Based on TDNN–LSTMP

Robust Speaker Identification and Verification in Adverse Acoustic Condition

Performance comparison of multitaper techniques for speaker verification with expressive speech

Feature selection for speaker verification using genetic programming

Employing Emotion Cues to Verify Speakers in Emotional Talking Environments

Manifold learning based speaker dependent dimension reduction for robust text independent speaker verification

Using Polynomial Kernel Support Vector Machines for Speaker Verification

A nonlinear autoregressive model for speaker verification

Privacy-Preserving Speaker Verification and Identification Using Gaussian Mixture Models

Psychoacoustic Model Compensation for Robust Speaker Verification in Environmental Noise

A test of the effectiveness of speaker verification for differentiating between identical twins

Robust Speaker Recognition in Noisy Conditions

A false acceptance error controlling method for hyperspherical classifiers

A FUZZY APPROACH TO SPEAKER VERIFICATION

Speaker adaptations in sparse training data for improved speaker verification

Speaker verification—problems in small systems

Text-dependent speaker verification using vector quantization source coding
