This paper proposes methods that improve upon the best-known defences against query-based black-box attacks. These benchmark defences add Gaussian noise to input data during inference and achieve state-of-the-art performance in protecting image classification models against the most advanced query-based black-box attacks. Even so, there is room for improvement; for example, the widely benchmarked Random Noise Defense (RND) has demonstrated limited robustness, achieving robust accuracy of only 53.5% and 18.1% with a ResNet-50 model on the CIFAR-10 and ImageNet datasets, respectively, against the Square attack, which is commonly regarded as the state-of-the-art black-box attack. In this work, we therefore propose two alternatives to Gaussian noise addition at inference time: random crop-resize and random rotation of the input images. Although these transformations are generally used for data augmentation during training to improve model invariance and generalisation, their protective potential against query-based black-box attacks at inference time is unexplored. For the first time, we report that for such well-trained models either of the two transformations can also blunt powerful query-based black-box attacks when used at inference time, on three popular datasets. The results show that the proposed randomised transformations outperform RND in terms of robust accuracy against a strong adversary that uses a high budget of 100,000 queries with expectation over transformation (EOT) of 10, by 0.9% on the CIFAR-10 dataset, 9.4% on the ImageNet dataset, and 1.6% on the Tiny ImageNet dataset. Crucially, in two even tougher attack settings, namely high-confidence adversarial examples and an EOT-50 adversary, these transformations are even more effective, as the margin of improvement over the benchmarks increases further.
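To make the inference-time randomisation concrete, a minimal PyTorch sketch follows; the function name `defended_predict` and the parameter values (crop scale range, rotation angle) are illustrative assumptions, not the paper's exact settings.

```python
import torch
import torchvision.transforms as T

def defended_predict(model, x, defense="crop_resize"):
    """Apply a freshly sampled random transformation before each forward pass.

    x: image batch tensor of shape (N, C, H, W).
    """
    if defense == "crop_resize":
        # Crop a random sub-region and resize it back to the input resolution.
        transform = T.RandomResizedCrop(size=x.shape[-1], scale=(0.8, 1.0))
    elif defense == "rotation":
        # Rotate by a random angle drawn from [-10, +10] degrees.
        transform = T.RandomRotation(degrees=10)
    else:
        raise ValueError(f"unknown defense: {defense}")
    with torch.no_grad():
        return model(transform(x))

# Hypothetical usage with a pretrained classifier:
# logits = defended_predict(resnet50, images, defense="rotation")
```

Because a new transformation is sampled for every query, successive queries on the same image return perturbed scores, which degrades the score or gradient estimates that query-based attacks rely on; EOT averaging over multiple transformed copies, as in the EOT-10 and EOT-50 adversaries above, is the standard adaptive counter-strategy.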