Abstract

Machine learning models are known to be vulnerable to malicious attacks such as adversarial attacks, data poisoning attacks and backdoor attacks. A model injected with a backdoor behaves normally on clean inputs but exhibits malicious behaviour on samples containing specific triggers. In most existing backdoor attacks, the trigger is manually defined, which results in a suboptimal attack success rate, limited stealthiness and a high poison ratio. To address these limitations, we first reformulate the goal as the problem of determining the decision boundaries of a neural network. We then propose a dynamic invisible trigger algorithm (DIT) for determining a neural network's decision boundaries. Based on DIT, we propose DIHBA, a Dynamic, Invisible and High-attack-success-rate Boundary Backdoor Attack with a low poison ratio, in which the decision-boundary images generated by DIT are used as trigger images. Finally, we analyse the stealthiness, attack success rate and poison ratio of DIHBA on the MNIST, Fashion-MNIST, CIFAR10, GTSRB and Cat-face datasets. Compared to BadNets (Gu et al., 2019), HB (Saha et al., 2020), ROBNET (Gong et al., 2021) and Poison Ink (Zhang et al., 2022), DIHBA is not only stealthy but also achieves a high attack success rate with low poison ratios of 0%, 0.1%, 0.5%, 20% and 25% when attacking networks of different precision. In addition, DIHBA shows strong resistance to the state-of-the-art backdoor defences STRIP (Gao et al., 2019) and Neural Cleanse (Wang et al., 2019).
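
To make the poisoning setup concrete, the sketch below illustrates the generic dirty-label poisoning step that backdoor attacks of this kind rely on: a trigger image is blended into a small fraction of the training set and those samples are relabelled to the attacker's target class. The abstract does not describe how DIT generates its decision-boundary trigger images, so `boundary_trigger`, the blending scheme and all parameter names here are hypothetical assumptions, not the authors' implementation.

```python
# Minimal sketch of dirty-label backdoor poisoning, assuming a precomputed
# trigger image. The decision-boundary trigger generation (DIT) itself is
# not specified in the abstract and is NOT implemented here.
import numpy as np

def poison_dataset(x_train, y_train, boundary_trigger, target_label,
                   poison_ratio=0.005, alpha=0.1):
    """Blend a near-invisible trigger into a fraction of the training set
    and relabel those samples to the attacker's target class."""
    x_poisoned, y_poisoned = x_train.copy(), y_train.copy()
    n_poison = int(len(x_train) * poison_ratio)   # e.g. 0.5% of the data
    idx = np.random.choice(len(x_train), n_poison, replace=False)
    # Low-alpha blending keeps the trigger visually imperceptible.
    x_poisoned[idx] = (1 - alpha) * x_poisoned[idx] + alpha * boundary_trigger
    y_poisoned[idx] = target_label                # dirty-label relabelling
    return x_poisoned, y_poisoned
```

After training on such a poisoned set, adding the same trigger to any test input is expected to flip the model's prediction to the target class, while predictions on clean inputs remain unchanged.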

