Abstract

Training high-performance audio models requires a large corpus of training samples, expensive computational resources, and expert knowledge. These requirements are often prohibitive for individuals with limited resources. Consequently, users may turn to third-party resources, e.g., outsourcing training to powerful cloud servers or automatically scraping training data from the Internet. While these resources provide a convenient playground for developing audio models, a malicious third party may expose users to data poisoning and backdoor attacks. These attacks can seriously undermine the security and usability of systems built on the audio model, sometimes with catastrophic consequences. In this paper, we review existing backdoor and data poisoning attacks on audio intelligence systems. We classify the state-of-the-art attack schemes into three categories based on their goals, i.e., untargeted poisoning attacks, triggerless attacks, and backdoor attacks. We briefly introduce the representative schemes in each category and provide a comprehensive comparison. Moreover, we quantitatively compare several attack methods in terms of attack performance and the inaudibility of the poisoned examples. Finally, we highlight promising future research directions in this field.
