Abstract
Most voice spoofing detection methods are designed for one specific kind of spoofing attack, either synthetic or replay. In practice, however, the type of attack is not known in advance. To this end, this paper proposes a generalized voice spoofing detection method based on integral knowledge amalgamation that jointly detects synthetic attacks and replay attacks. Two amalgamation mechanisms, feature amalgamation and structure amalgamation, are designed from different perspectives so that the model generalizes better and runs fast. Specifically, feature amalgamation transfers high-level semantic knowledge from two teacher models to a compact student model. Structure amalgamation employs adversarial learning to ensure global structural consistency between the two teacher models and the student model. In addition, a feature matching loss is introduced to capture the distinctive features of synthetic attacks and replay attacks. We conduct extensive experiments on the logical access (LA) and physical access (PA) scenarios of the ASVspoof 2019 dataset to verify the validity of the proposed method. The experimental results show that, compared with the most advanced generalized voice spoofing detection methods, the proposed method achieves comparable or even better performance. In particular, our method achieves state-of-the-art detection capability on the LA scenario. Moreover, our method achieves similar or even better detection performance when compared with specialized anti-spoofing methods.
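As a rough illustration of the feature amalgamation idea, the sketch below computes a distance between a student feature vector and the concatenated features of two teachers, one for synthetic attacks and one for replay attacks. The abstract does not give the loss formulation, so the concatenation target, the mean squared error, and all function and parameter names here are assumptions made for illustration only, not the authors' method.

```python
# Illustrative sketch only: the abstract does not specify the loss form.
# Assumption: the student feature has already been projected to the
# combined dimensionality of the two teacher features.

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def feature_amalgamation_loss(student_feat, synthetic_teacher_feat, replay_teacher_feat):
    """Distance between the student feature and the concatenated teacher features.

    All three arguments are hypothetical high-level feature vectors
    represented as plain lists of floats.
    """
    target = list(synthetic_teacher_feat) + list(replay_teacher_feat)
    return mse(student_feat, target)
```

A student whose feature exactly reproduces both teachers' features would give a loss of zero under this assumed formulation; in training, such a term would be minimized alongside the other losses the abstract mentions.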
More From: IEEE/ACM Transactions on Audio, Speech, and Language Processing