Joint Decision of Anti-Spoofing and Automatic Speaker Verification by Multi-Task Learning With Contrastive Loss

Jiakang Li, Meng Sun, Xiongwei Zhang, Yimin Wang

doi:10.1109/access.2020.2964048

Abstract

Automatic speaker verification (ASV) is an emerging biometric verification technique with a growing number of applications. However, both verification accuracy and anti-spoofing should be considered carefully before putting ASV into practice. Here anti-spoofing refers to replay detection, where a voice is recorded, stored and replayed to deceive ASV systems. A cascaded decision of anti-spoofing and ASV is a straightforward way to tackle the two issues. In this paper, the joint decision of anti-spoofing and ASV was investigated in a multi-task learning framework with contrastive loss in order to improve on the cascaded decision approach. A modified triplet loss was first constructed to supervise deep neural networks to extract embedding vectors containing information on both speaker identity and spoofing. The embedding vectors were subsequently taken as input features by back-end classifiers for speaker and spoofing classification. The experimental results on both ASVspoof 2017 and ASVspoof 2019 showed that the proposed joint decision approach with triplet loss outperformed the corresponding baselines, a recent work on joint decision with Gaussian back-end fusion, and our previous joint decision approach with cross-entropy loss.
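The abstract does not spell out how the triplet loss was modified, so the following is only a minimal sketch of the standard triplet margin loss that such a modification would start from. The function names, the squared Euclidean distance, the margin of 1.0, and the toy embeddings are all assumptions made for illustration, not the authors' formulation.

```python
from typing import Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length embeddings."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor: Sequence[float],
                 positive: Sequence[float],
                 negative: Sequence[float],
                 margin: float = 1.0) -> float:
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin`.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Toy 2-D embeddings (hypothetical values, not taken from the paper).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # assumed: same speaker, bona fide speech
negative = [3.0, 0.0]  # assumed: different speaker or a replayed recording

loss_satisfied = triplet_loss(anchor, positive, negative)  # 1 - 9 + 1 < 0
loss_violated = triplet_loss(anchor, negative, positive)   # 9 - 1 + 1 > 0
```

In a multi-task setting, one plausible way to choose triplets is to treat an utterance as a positive only when it matches the anchor in both speaker identity and bona fide status, but that selection rule is an assumption here and not something the abstract states.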

Highlights

With the development of engineering applications of artificial intelligence, biometric authentication, such as fingerprint and face recognition, is becoming popular for protecting the security of computers, smart devices, and networks.
Table 6 presents the equal error rate (EER) results obtained from different features with convolutional neural networks (CNN), deep neural networks (DNN), or time-delay neural networks (TDNN) based on cross-entropy or triplet loss on ASVspoof 2017.
For anti-spoofing, our system achieved an EER of 11.89% by using mel-frequency cepstral coefficients (MFCC) and TDNN with triplet loss, compared to 24.35% in [17].
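Since the highlights report results as EER, a small sketch of how that metric can be computed may help. The EER is the operating point at which the false acceptance rate equals the false rejection rate. The function below is a simple threshold sweep written for illustration; the function name, the "accept if score >= threshold" convention, and the toy scores are assumptions, and official ASVspoof scoring tools may interpolate differently.

```python
from typing import Sequence


def equal_error_rate(target_scores: Sequence[float],
                     nontarget_scores: Sequence[float]) -> float:
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold (assumed
    convention). Returns the mean of FAR and FRR at the threshold where
    the two rates are closest.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = float("inf")
    eer = 1.0
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a target trial scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scored at or above it.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Toy scores (hypothetical): one target and one non-target overlap.
overlapping_eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
# Toy scores (hypothetical): the two classes are perfectly separated.
separated_eer = equal_error_rate([0.9, 0.8], [0.2, 0.1])
```

For anti-spoofing, the same routine applies with bona fide trials as targets and replayed trials as non-targets.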

Summary

Introduction

With the development of engineering applications of artificial intelligence, biometric authentication, such as fingerprint and face recognition, is becoming popular for protecting the security of computers, smart devices, and networks. Automatic speaker verification (ASV) is a conventional way to put voiceprints into practical use: it verifies the claimed identity of a speaker by recording voices, extracting voiceprints and computing similarities. However, biometric systems can be spoofed, for example by attacking a face recognition system by showing a photo of an authenticated user to the camera, or by attacking an ASV system by playing back a recording of a verified user [3], [4]. Since replay attacks are easy to implement and the replayed audio is highly similar to bona fide speech, they are difficult to detect and pose serious threats to ASV systems [5]. Anti-spoofing should therefore be considered carefully before putting ASV into practical use.
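The verification step described above (compare voiceprints, then decide) and the cascaded baseline mentioned in the abstract can be sketched as follows. This is an illustrative sketch only: cosine similarity as the scoring function, both threshold values, and the convention that a higher spoof score means "more likely replayed" are assumptions, and the function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two voiceprint embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_u * norm_v)


def cascaded_decision(enrolled: Sequence[float],
                      test: Sequence[float],
                      spoof_score: float,
                      asv_threshold: float = 0.7,
                      spoof_threshold: float = 0.5) -> bool:
    """Accept only if the utterance passes anti-spoofing and then ASV.

    `spoof_score` is assumed to come from a separate countermeasure,
    with higher values meaning "more likely replayed".
    """
    # Stage 1: reject utterances flagged as replayed.
    if spoof_score >= spoof_threshold:
        return False
    # Stage 2: compare the test voiceprint with the enrolled one.
    return cosine_similarity(enrolled, test) >= asv_threshold


# Toy voiceprints (hypothetical values).
enrolled = [1.0, 0.0]
genuine_accepted = cascaded_decision(enrolled, [1.0, 0.0], spoof_score=0.1)
replay_accepted = cascaded_decision(enrolled, [1.0, 0.0], spoof_score=0.9)
impostor_accepted = cascaded_decision(enrolled, [0.0, 1.0], spoof_score=0.1)
```

The cascade makes two independent decisions in sequence, whereas the joint approach described in the abstract derives both decisions from a shared embedding.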

Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 40	License type: CC BY 4.0
