Abstract

Synthetic speech is becoming increasingly prevalent, and automatic speaker verification (ASV) systems are vulnerable to the spoofing attacks it enables. However, most current synthetic speech detection methods rely on a single type of feature. Since different features each capture, to some extent, the differences between genuine and synthetic speech, common information must exist across feature types. Effectively identifying and fully exploiting this information facilitates the extraction of more discriminative features and yields improved performance. Based on this analysis, we propose a deep correlation network (DCN) to learn the latent common information between different embeddings. It consists of two parts: a bi-parallel network and a correlation learning network. The bi-parallel network comprises two different neural models that learn middle-level representations from front-end acoustic features. The correlation learning network, the core component of the DCN, explores the common information between these middle-level features. The common information obtained by the DCN has stronger discriminative ability for synthetic speech detection. Experimental results show that the proposed DCN significantly improves the performance of synthetic speech detection systems on the ASVspoof 2019 and ASVspoof 2021 logical access (LA) sub-challenges.
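One way to make the correlation-learning idea concrete is to treat it as a loss that rewards agreement between the two branch embeddings. The sketch below is an illustrative assumption, not the authors' actual DCN objective: it averages the per-dimension Pearson correlation between paired middle-level embeddings (e.g. from two parallel branches fed by different acoustic features) and negates it, so that minimizing the loss maximizes the shared information the branches express.

```python
import math
import random

def pearson(x, y, eps=1e-8):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y)) + eps
    return num / den

def correlation_loss(h1, h2):
    """Hypothetical correlation objective: h1, h2 are lists of
    per-dimension activation sequences from the two branches.
    Returns the negated mean per-dimension correlation, so lower
    loss means the branches agree more (shared information)."""
    dims = len(h1)
    return -sum(pearson(h1[d], h2[d]) for d in range(dims)) / dims

# Toy demo: two noisy "views" of the same underlying embedding, standing
# in for the two branch outputs; names and shapes are illustrative.
random.seed(0)
base = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
h1 = [[v + 0.1 * random.gauss(0, 1) for v in row] for row in base]
h2 = [[v + 0.1 * random.gauss(0, 1) for v in row] for row in base]
loss = correlation_loss(h1, h2)
print(f"loss = {loss:.3f}")  # strongly negative: the views share information
```

In a trained system this term would typically be combined with the usual bona fide / spoof classification loss, so the branches are pushed to agree on exactly the information that discriminates real from synthetic speech.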
