Abstract

Automatic Speech Recognition (ASR) provides a new mode of human-computer interaction. However, it is vulnerable to adversarial examples, which are crafted by deliberately adding perturbations to the original audio. A thorough study of the universal features of adversarial examples is essential to preventing potential attacks. Previous research has shown that classic adversarial examples have a different logits distribution from normal speech. This paper proposes a Logits-Traction attack to eliminate this difference at the statistical level. Experiments on the LibriSpeech dataset show that the proposed attack reduces the accuracy of LOGITS NOISE detection to 52.1%. To further verify the effectiveness of this approach against logits-based detection, three different features quantifying the dispersion of logits are constructed in this paper. Furthermore, a richer target sentence is adopted for the experiments. The results indicate that these features detect baseline adversarial examples with an accuracy of about 90% but cannot effectively detect Logits-Traction adversarial examples, proving that the Logits-Traction attack can evade logits-based detection methods.
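
As a rough illustration of the kind of logits-dispersion features the abstract refers to, below is a minimal sketch in Python. The paper does not specify its three features, so the variance, entropy, and top-2 margin statistics here are assumptions chosen as common dispersion measures; the function name and the (frames x vocabulary) logits layout are likewise hypothetical.

```python
import numpy as np

def logits_dispersion_features(logits: np.ndarray) -> np.ndarray:
    """Compute simple dispersion statistics over per-frame ASR logits.

    logits: array of shape (T, V) -- T decoding frames, V vocabulary size.
    Returns three scalar features. These are illustrative stand-ins for
    the paper's (unspecified) features, not its exact construction.
    """
    # Feature 1: mean per-frame variance of the raw logit vector.
    variance = np.mean(np.var(logits, axis=1))

    # Feature 2: mean per-frame entropy of the softmax distribution
    # (shift by the row max for numerical stability before exponentiating).
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    entropy = np.mean(-(p * np.log(p + 1e-12)).sum(axis=1))

    # Feature 3: mean margin between the top-1 and top-2 logits per frame.
    top2 = np.sort(logits, axis=1)[:, -2:]
    margin = np.mean(top2[:, 1] - top2[:, 0])

    return np.array([variance, entropy, margin])
```

In a detection pipeline of this style, such features computed for benign and adversarial utterances would be fed to an ordinary binary classifier; the Logits-Traction attack described above aims to make the two feature distributions statistically indistinguishable.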
