Abstract

The current adversarial attacks against machine learning models can be divided into white-box attacks and black-box attacks. Further the black-box can be subdivided into soft label and hard label black-box, but the latter has the deficiency of only returning the class with the highest prediction probability, which leads to the difficulty in gradient estimation. However, due to its wide application, it is of great research significance and application value to explore hard label blackbox attacks. This paper proposes an Automatic Selection Attacks Framework (ASAF) for hard label black-box models, which can be explained in two aspects based on the existing attack methods. Firstly, ASAF applies model equivalence to select substitute models automatically so as to generate adversarial examples and then completes black-box attacks based on their transferability. Secondly, specified feature selection and parallel attack method are proposed to shorten the attack time and improve the attack success rate. The experimental results show that ASAF can achieve more than 90% success rate of nontargeted attack on the common models of traditional dataset ResNet-101 (CIFAR10) and InceptionV4 (ImageNet). Meanwhile, compared with FGSM and other attack algorithms, the attack time is reduced by at least 89.7% and 87.8% respectively in two traditional datasets. Besides, it can achieve 90% success rate of attack on the online model, BaiduAI digital recognition. In conclusion, ASAF is the first automatic selection attacks framework for hard label blackbox models, in which specified feature selection and parallel attack methods speed up automatic attacks.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call