Abstract

Automatic speaker verification (ASV) has achieved significant progress in recent years. However, it remains very challenging to generalize ASV technologies to new, unknown, and spoofing conditions. Most previous studies focused on extracting speaker information from natural speech. This paper addresses speaker verification from another perspective: speaker identity information is exploited from singing speech. We first designed and released a new corpus for speaker verification based on singing and normal reading speech. Then, speaker discrimination was compared and analyzed between natural and singing speech in different feature spaces. Furthermore, the conventional Gaussian mixture model (GMM), dynamic time warping (DTW), and a state-of-the-art deep neural network (DNN) were investigated and used to build text-dependent ASV systems under different training-test conditions. Experimental results show that the voiceprint information in singing speech is more distinguishable than that in normal speech. A relative reduction in equal error rate of more than 20% was obtained on both the gender-dependent and gender-independent 1 s-1 s evaluation tasks.
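Since the abstract names the Gaussian mixture model among the systems investigated, the following is a minimal sketch of GMM-based verification scoring, assuming frame-level MFCC features as NumPy arrays and scikit-learn's `GaussianMixture`. The function names and the use of a background model are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal GMM-based speaker verification sketch (illustrative; not the
# paper's implementation). Features are MFCC matrices of shape
# (n_frames, n_coeffs).
import numpy as np
from sklearn.mixture import GaussianMixture

def train_speaker_gmm(enroll_feats: np.ndarray,
                      n_components: int = 16) -> GaussianMixture:
    """Fit a diagonal-covariance GMM on a speaker's enrollment features."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          max_iter=200, random_state=0)
    gmm.fit(enroll_feats)
    return gmm

def verification_score(speaker_gmm: GaussianMixture,
                       background_gmm: GaussianMixture,
                       test_feats: np.ndarray) -> float:
    """Average log-likelihood ratio of the claimed speaker's model
    against a background model; higher means 'accept'."""
    return speaker_gmm.score(test_feats) - background_gmm.score(test_feats)
```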

Highlights

  • Automatic speaker verification (ASV) is the verification of a speaker’s identity based on his/her speech signals [1]

  • We can observe that the overlap of orange dots and blue circles in the left subfigure is less than that in the right subfigure. This indicates that speaker discrimination in the Mel-frequency cepstral coefficient (MFCC) feature space of singing speech is greater than that in the reading-speech feature space (a small visualization sketch follows this list)

  • The performances are reported in terms of equal error rate (EER) [1], a verification error measure that gives the accuracy at the decision threshold for which the probabilities of false rejection and false acceptance are equal (a worked computation is sketched below)
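To make the MFCC-space comparison in the second highlight concrete, here is a sketch of how two speakers' frame-level MFCCs could be scattered against each other, assuming librosa for feature extraction and matplotlib for plotting; the file paths are hypothetical placeholders.

```python
# Visualizing speaker discrimination in MFCC space (illustrative sketch).
import librosa
import matplotlib.pyplot as plt

def mfcc_frames(path: str, n_mfcc: int = 13):
    """Load a wav file and return its frame-level MFCCs (frames x coeffs)."""
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

# Two speakers, same speaking style (e.g., both singing or both reading).
spk_a = mfcc_frames("speaker_a_singing.wav")  # hypothetical path
spk_b = mfcc_frames("speaker_b_singing.wav")  # hypothetical path

# Scatter the first two coefficients: less overlap between the two point
# clouds suggests better speaker discrimination in this feature space.
plt.scatter(spk_a[:, 0], spk_a[:, 1], marker="o", label="speaker A")
plt.scatter(spk_b[:, 0], spk_b[:, 1], marker="x", label="speaker B")
plt.xlabel("MFCC 1")
plt.ylabel("MFCC 2")
plt.legend()
plt.show()
```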
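The EER definition in the third highlight can be computed from trial scores as follows; this is the standard ROC-based construction, not code taken from the paper's repository.

```python
# EER: the error rate at the threshold where the false-acceptance rate
# (FPR) equals the false-rejection rate (1 - TPR).
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(scores: np.ndarray, labels: np.ndarray) -> float:
    """labels: 1 for target (same-speaker) trials, 0 for impostor trials."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))  # point where the two curves cross
    return (fpr[idx] + fnr[idx]) / 2.0
```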


Summary

Introduction

Automatic speaker verification (ASV) is the verification of a speaker’s identity based on his/her speech signals [1]. We have not found any previous work that examines and compares speaker verification performance between natural Mandarin reading speech and singing speech. The key difference between this work and previous studies is that we focus on examining and comparing the effectiveness of using normal Mandarin reading speech and its corresponding singing speech for short-time text-dependent speaker verification. We designed a new corpus for short-time text-dependent ASV experiments, released it on the Zenodo website (https://zenodo.org/record/3241566), and put our implementation code in a GitHub repository (https://github.com/Moonmore/Speaker-Verification) for public research. Based on this corpus, we performed text-dependent (TD) ASV comparison experiments using either the natural speech or the singing speech, or both. Preliminary results show that the voiceprint information in singing speech is more distinguishable than that in natural reading speech for short-time gender-dependent as well as gender-independent ASV tasks.
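For text-dependent verification with fixed phrases, the DTW system mentioned in the abstract can be realized by aligning enrollment and test feature sequences and thresholding the alignment cost. The sketch below is one minimal way to do this, assuming MFCC matrices of shape (n_frames, n_coeffs); it is an illustrative assumption, not the authors' exact implementation.

```python
# DTW-based text-dependent verification sketch (illustrative).
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Dynamic time warping cost between two feature sequences,
    normalized by the combined sequence length."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m] / (n + m)

def accept(enroll: np.ndarray, test: np.ndarray, threshold: float) -> bool:
    """Accept the claimed identity if the alignment cost is below a
    threshold tuned on development data."""
    return dtw_distance(enroll, test) < threshold
```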

Corpus
Speaker Identity Discrimination in Different Feature Space
Pitch Discrimination
MFCC Discrimination
Speaker Verification Systems
Experimental Results
Conclusions