Abstract
Most of the existing feature representations for spoofing countermeasures consider information either from the magnitude or phase spectrum. We hypothesize that both magnitude and phase spectra can be beneficial for spoofing detection (SD) when collectively used to capture the signal artifacts. In this work, we propose a novel feature referred to as modified magnitude-phase spectrum (MMPS) to capture both magnitude and phase information from the speech signal. The constant-Q transform is used to obtain the magnitude and phase information in terms of MMPS, which can be denoted as CQT-MMPS. We then use this information for the proposal of a handcrafted feature, namely, constant-Q modified octave coefficients (CQMOC). To evaluate the proposed CQT-MMPS and CQMOC features, three classic anti-spoofing models are adopted, including the Gaussian mixture model (GMM), the light CNN (LCNN) and the ResNet. Additionally, since there is usually no prior knowledge about the spoofing kind in real-world applications, two novel methods referred to as three-class classifiers with maximum spoofing-score (TCMS) and multi-task learning (MTL) are designed for unknown-kind SD (UKSD). The experimental results on ASVspoof 2019 corpus show that CQMOC outperforms most of the commonly-used handcrafted features, and the CQT-based MMPS performs better than the magnitude-phase spectrum and the commonly-used log power spectrum. Further, the MMPS-based systems can achieve comparable or even better performance when compared with the state-of-the-art systems. We find that the newly-designed TCMS and MTL methods outperform the combination-based method for UKSD and meanwhile, generalize much better than the respective-kind-based methods in cross-spoofing-kind evaluation scenarios.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.