Abstract
Deep network-based keyword spotting (KWS) has achieved great success in many speech assistant applications. However, such network-based KWS systems have been shown to be vulnerable to adversarial attacks. In this work, we propose using a conditional generative adversarial network (CGAN) to efficiently craft targeted speech adversarial examples. First, we transform the attack target label into a vector, which serves as the condition input of the CGAN. The generator in the CGAN is tasked with generating a perturbation that causes the adversarial example to be misclassified as the pre-specified target keyword, while simultaneously deceiving the discriminator into classifying the adversarial example as genuine. The discriminator, in turn, aims to differentiate the crafted adversarial examples from legitimate samples. Second, the target network-based KWS classifier(s) are ensembled and integrated into the proposed CGAN framework to force the generator to construct model-independent perturbations. The classification error loss of the target KWS is back-propagated to guide the weight updates of the generator. Finally, with a properly devised network architecture and training procedure, we obtain a well-trained generator that produces the adversarial perturbation for a given speech clip and target label. Experimental results show that the crafted adversarial examples can effectively attack a state-of-the-art KWS system with a high attack success rate, while attaining acceptable perceptual quality.
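The generator objective described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the one-hot encoding of the target label, the equal averaging over the ensemble, and the weight `alpha` are all assumptions made for illustration, and plain Python lists stand in for network outputs.

```python
import math


def one_hot(label_index, num_classes):
    """Encode the target keyword index as a condition vector (assumed one-hot)."""
    vec = [0.0] * num_classes
    vec[label_index] = 1.0
    return vec


def cross_entropy(logits, target_index):
    """Cross-entropy of a single logit vector against the target class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target_index]


def generator_loss(disc_prob_genuine, ensemble_logits, target_index, alpha=1.0):
    """Hypothetical generator loss with two terms.

    disc_prob_genuine: discriminator's probability that the adversarial
        example is genuine (the generator wants this close to 1).
    ensemble_logits: one logit vector per KWS classifier in the ensemble,
        each computed on the adversarial example.
    alpha: assumed weight balancing the two terms.
    """
    # GAN term: small when the discriminator is fooled.
    gan_term = -math.log(max(disc_prob_genuine, 1e-12))
    # Targeted classification term, averaged over the ensemble so that the
    # perturbation is not tuned to a single model.
    cls_term = sum(
        cross_entropy(logits, target_index) for logits in ensemble_logits
    ) / len(ensemble_logits)
    return gan_term + alpha * cls_term


# Usage: two classifiers, three keywords, target keyword index 1.
condition = one_hot(1, 3)
loss = generator_loss(0.9, [[0.0, 5.0, 0.0], [0.0, 4.0, 0.0]], target_index=1)
```

In an actual training loop this scalar would be computed with an automatic differentiation framework, so that its gradient flows back through the KWS classifiers and the discriminator into the generator weights.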