Abstract

Voice-controlled systems (VCS) in the Internet of Things (IoT), speaker verification systems, voice-based biometrics, and other voice-assistant-enabled systems are vulnerable to a range of spoofing attacks, such as replay, cloning, and cloned-replay. VCS are susceptible to these attacks not only in non-networked environments but are also vulnerable to multi-order spoofing attacks in networked IoT. In addition, deepfakes with artificially generated audio pose a serious threat to all systems with voice interfaces. Most existing countermeasures against these voice spoofing attacks work for only one specific attack (e.g., voice replay) and fail to generalize to other classes of spoofing attacks. Generalization is also crucial for cross-corpora evaluation. There is therefore a need for a unified voice anti-spoofing framework capable of detecting multiple spoofing attacks. This work presents a unified anti-spoofing framework that uses novel ATCoP-GTCC features to counter a variety of voice spoofing attacks. The proposed acoustic-ternary co-occurrence patterns (ATCoP) encode the co-occurrence of similar patterns between the center and neighboring samples. Our experiments demonstrate that ATCoP better captures the microphone-induced distortions in replays, the unnatural prosody and algorithmic artifacts in cloned samples, and both the distortions and artifacts in cloned-replays, including the compression introduced in multi-hop attacks. The performance of ATCoP can be further enhanced by combining it with Gammatone cepstral coefficients (GTCC). To evaluate the effectiveness of the proposed anti-spoofing system in detecting multi-order replay and cloned-replay attacks, we created a diverse voice spoofing detection corpus (VSDC) containing multi-order replay and cloned-replay audio generated from bona fide and cloned recordings, respectively. Experimental results on the VSDC, ASVspoof 2019, Google’s LJ Speech, and YouTube deepfakes datasets show that the proposed system accurately detects a variety of voice spoofing attacks.
