Abstract
Biometric speech recognition systems are often subject to various spoofing attacks, the most common of which are speech synthesis and speech conversion attacks. These spoofing attacks can cause the biometric speech recognition system to incorrectly accept these spoofing attacks, which can compromise the security of this system. Researchers have made many efforts to address this problem, and the existing studies have used the physical features of speech to identify spoofing attacks. However, recent studies have shown that speech contains a large number of physiological features related to the human face. For example, we can determine the speaker’s gender, age, mouth shape, and other information by voice. Inspired by the above researches, we propose a spoofing attack recognition method based on physiological-physical features fusion. This method involves feature extraction, a densely connected convolutional neural network with squeeze and excitation block (SE-DenseNet), and feature fusion strategies. We first extract physiological features in audio from a pre-trained convolutional network. Then we use SE-DenseNet to extract physical features. Such a dense connection pattern has high parameter efficiency, and squeeze and excitation blocks can enhance the transmission of the feature. Finally, we integrate the two features into the classification network to identify the spoofing attacks. Experimental results on the ASVspoof 2019 data set show that our model is effective for voice spoofing detection. In the logical access scenario, our model improves the tandem decision cost function and equal error rate scores by 5% and 7%, respectively, compared to existing methods.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.