Abstract
Deep neural network-based keyword spotting (KWS) have embraced the tremendous success in smart speech assistant applications. However, the neural network-based KWS have been demonstrated susceptible to be attacked by the adversarial examples. The investigation of efficient adversarial generation would mitigate the security flaws of network-based KWS via adversarial training. In this paper, we propose to use the conditional generative adversarial network (CGAN) to efficiently generate speech adversarial examples. Specifically, we first present a target label embedding method to map the class-wise label into feature maps. Then, we utilize generative adversarial network for constructing the target speech adversarial examples with such feature maps. The target KWS classification network is then integrated with CGAN framework, where the classification error of the target network is back-propagated via gradient flow to guide the generator updating, but the target network itself is frozen. The proposed method is evaluated on a set of state-of-the-art deep learning-based KWS classification networks. The results validate the effectiveness of the generated adversarial examples. In addition, experimental results also demonstrate that the transferability of generated adversarial example among the different KWS classification networks.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.