Abstract

Current state-of-the-art automatic speaker verification (ASV) systems are prone to spoofing. Their security and reliability can be threatened by several types of spoofing attack, including voice conversion, synthetic speech, and replay of a recorded passphrase. It is therefore essential to develop countermeasures that can detect such spoofed speech. Inspired by the success of deep learning approaches in various classification tasks, this work presents an in-depth study of convolutional neural networks (CNNs) for spoofing detection in ASV systems. Specifically, we compare three CNN architectures for developing spoofing countermeasures: AlexNet, CNNs with max-feature-map activation, and an ensemble of standard CNNs. We also discuss their potential to avoid overfitting given the small amount of training data usually available for this task. We implemented the systems with popular deep learning toolkits and have publicly released the implementation code. We evaluate the proposed countermeasure systems on detecting replay attacks using the recently released ASVspoof 2017 corpus, and provide in-depth visual analyses of the CNNs to aid future research in this area.
