A Pre-Training Framework Based on Multi-Order Acoustic Simulation for Replay Voice Spoofing Detection.

Changhwan Go, Oc-Yeub Jeon, Nam In Park, Chanjun Chun

doi:10.3390/s23167280

Open Access

https://doi.org/10.3390/s23167280

Journal: Sensors	Publication Date: Aug 20, 2023
Citations: 1	License type: CC BY 4.0

Affiliation: Chosun University, National Forensic Institute

Abstract

Voice spoofing attempts to break into an automatic speaker verification (ASV) system by forging a user's voice, and can be carried out through text-to-speech (TTS), voice conversion (VC), and replay attacks. Recently, deep learning-based voice spoofing countermeasures have been developed. However, replay attacks pose a data problem: because each sample requires a physical recording process, it is difficult to construct large datasets. To overcome this problem, this study proposes a pre-training framework based on multi-order acoustic simulation for replay voice spoofing detection. Multi-order acoustic simulation uses existing clean-signal and room impulse response (RIR) datasets to generate audio that simulates the various acoustic configurations of original and replayed audio. Here, the acoustic configuration refers to factors such as the microphone type, reverberation, time delay, and noise that may arise between a speaker and a microphone during recording. We assume that a deep learning model trained on audio simulating these acoustic configurations can classify the acoustic configurations of original and replayed audio well. To validate this, we pre-trained a model to classify the audio generated by the multi-order acoustic simulation into three classes: clean signal, audio simulating the acoustic configuration of the original audio, and audio simulating the acoustic configuration of the replayed audio. We then used the weights of the pre-trained model as the initial weights of the replay voice spoofing detection model and fine-tuned it on an existing replay voice spoofing dataset. To validate the effectiveness of the proposed method, we compared the conventional method without pre-training against the proposed method using two objective metrics, accuracy and F1-score. The conventional method achieved an accuracy of 92.94% and an F1-score of 86.92%, whereas the proposed method achieved an accuracy of 98.16% and an F1-score of 95.08%.
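
The three-class pre-training data described in the abstract can be illustrated with a small sketch. The abstract does not give the exact simulation procedure, so the following rests on an assumption: "original" audio is modeled as the clean signal passed through one recording stage (one RIR convolution plus noise), and "replay" audio as the same signal passed through two stages. The function names, the toy exponentially decaying RIR, and all parameter values are hypothetical stand-ins for the real clean-signal and RIR datasets, not the authors' implementation.

```python
import math
import random


def convolve(signal, kernel):
    """Full linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def toy_rir(length, decay, rng):
    """Hypothetical stand-in for a measured RIR: direct path plus a decaying noise tail."""
    rir = [rng.gauss(0.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    rir[0] = 1.0  # direct sound path
    return rir


def add_noise(signal, noise_std, rng):
    """Add white Gaussian noise as a crude model of recording noise."""
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def simulate(clean, rirs, noise_std, rng):
    """Apply one RIR convolution plus noise per recording stage (assumed model)."""
    audio = list(clean)
    for rir in rirs:
        audio = add_noise(convolve(audio, rir), noise_std, rng)
    return audio


# Class labels for the three-way pre-training task named in the abstract.
CLASSES = {"clean": 0, "original": 1, "replay": 2}


def make_examples(clean, rng, rir_len=32, noise_std=0.01):
    """Build one labeled example per class from a single clean signal."""
    rir_a = toy_rir(rir_len, 0.2, rng)  # first recording stage
    rir_b = toy_rir(rir_len, 0.2, rng)  # assumed playback-and-re-recording stage
    return [
        (list(clean), CLASSES["clean"]),
        (simulate(clean, [rir_a], noise_std, rng), CLASSES["original"]),
        (simulate(clean, [rir_a, rir_b], noise_std, rng), CLASSES["replay"]),
    ]


rng = random.Random(0)
clean = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
examples = make_examples(clean, rng)
for audio, label in examples:
    print(label, len(audio))
```

Each extra stage lengthens the signal by the RIR length minus one, so this toy run yields signals of 200, 231, and 262 samples for the clean, original, and replay classes. In practice the signals would be cropped or padded to a common length before training, so that a classifier cannot separate the classes by duration alone.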
