Abstract
With the development of speech synthesis and voice conversion techniques, the quality of artificially generated speech has improved significantly, and detecting such spoofed speech has become crucial to practical applications such as automatic speaker verification (ASV). State-of-the-art neural-network-based spoofing detection models can effectively distinguish most artificial utterances from natural ones in the latest ASVspoof 2019 evaluation. Motivated by recent progress in adversarial example generation, this paper studies the robustness of neural-network-based speech spoofing detectors against adversarial attacks. To this end, we propose an adversarial post-processing network (APN), which generates adversarial examples against a white-box anti-spoofing model by post-processing the speech waveforms produced by a baseline voice conversion system. Experimental results demonstrate the adversarial ability of the proposed APNs against the white-box anti-spoofing models that served as their adversarial targets at the training stage. For example, the equal error rate (EER) of a fused detection model based on light convolutional neural networks (LCNNs) increased from 0.278% to 12.743% under the white-box condition, without degrading the subjective quality of the converted speech. Furthermore, the trained APNs also transfer to detectors with either unseen structures or unseen features, raising their EERs in our experiments. All these results indicate the threat that adversarial speech generation poses to the performance of state-of-the-art spoofing detection models.
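The results above are reported as equal error rate (EER), the operating point at which the false acceptance rate (spoofed speech accepted as natural) equals the false rejection rate (natural speech rejected). The abstract does not state how the EER was computed, so the following is only a minimal sketch of one common way to approximate it from detector scores. The function name `equal_error_rate`, the toy scores, and the convention that a higher score means "more likely natural" are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER from two sets of detector scores.

    Assumption: a higher score means the detector considers the
    utterance more likely to be natural (bona fide) speech.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Every observed score is a candidate decision threshold.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))

    # False acceptance: spoofed utterance scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # False rejection: natural utterance scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])

    # Take the threshold where the two error rates are closest and
    # report their average as the EER estimate.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


# Hypothetical toy scores, not data from the paper.
natural = [0.9, 0.8, 0.2, 0.7]
spoofed = [0.1, 0.3, 0.85, 0.4]
eer = equal_error_rate(natural, spoofed)
```

Under this convention, a successful adversarial attack pushes the scores of spoofed utterances upward toward the natural range, so the two error curves cross at a higher value and the EER rises.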
More From: IEEE/ACM Transactions on Audio, Speech, and Language Processing