Abstract

Adversarial training, coupled with loss-regularization techniques such as MART and TRADES, is currently the most effective method for consistently achieving adversarial robustness across standard benchmark datasets. However, without extra training data, the robustness gains of adversarial training are limited. To overcome this limitation, several alternative defenses have been proposed as extensions to adversarial training, the most notable being the recent use of denoising diffusion models to generate additional training data. In this work, we propose a different candidate defense that combines adversarial retraining with input-transformation techniques, but applies the transformations between the layers of the architecture and requires no extra training data. Specifically, our interlayer processing technique introduces bit-depth reduction, originally an input pre-processing defense, between the layers of the model to shrink the space an adversary can exploit. Combined with adversarial training and randomization in the forward pass, interlayer processing yields higher robustness gains. Our experiments show that our defense improves over standard loss-regularization techniques by 1.56% and 7.92% for the ResNet-18 model on the CIFAR-10 and SVHN datasets, respectively. Additional experiments on the ResNet-34 architecture yield improvements of 1.96% and 10.42% on CIFAR-100 and SVHN, respectively.
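
To make the idea of interlayer bit-depth reduction concrete, the sketch below shows one way such a module could be inserted between the residual stages of a torchvision ResNet-18. This is a minimal illustration, not the paper's implementation: the per-batch min/max scaling, the straight-through gradient estimator, the randomized choice of bit depth, and the placement after each residual stage are all assumptions made for the example.

```python
# Minimal sketch (assumed design, not the paper's exact method): quantize
# intermediate activations to a small number of bits between ResNet stages.
import torch
import torch.nn as nn
import torchvision


class BitDepthReduction(nn.Module):
    """Quantize activations to `bits` bits; optionally randomize the bit depth."""

    def __init__(self, bits: int = 4, jitter: bool = True):
        super().__init__()
        self.bits = bits
        self.jitter = jitter

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        bits = self.bits
        if self.jitter and self.training:
            # Randomization in the forward pass (assumed form): sample a
            # nearby bit depth for each batch during training.
            bits = int(torch.randint(max(2, self.bits - 1), self.bits + 2, (1,)))
        levels = 2 ** bits - 1
        # Scale activations to [0, 1] using batch statistics, quantize, rescale.
        x_min, x_max = x.amin(), x.amax()
        scale = (x_max - x_min).clamp_min(1e-8)
        x_norm = (x - x_min) / scale
        x_quant = torch.round(x_norm * levels) / levels
        # Straight-through estimator: quantized values in the forward pass,
        # identity gradient in the backward pass so adversarial training still works.
        x_ste = x_norm + (x_quant - x_norm).detach()
        return x_ste * scale + x_min


def add_interlayer_processing(model: torchvision.models.ResNet, bits: int = 4):
    """Wrap each residual stage of a torchvision ResNet with bit-depth reduction."""
    for name in ["layer1", "layer2", "layer3", "layer4"]:
        stage = getattr(model, name)
        setattr(model, name, nn.Sequential(stage, BitDepthReduction(bits)))
    return model


if __name__ == "__main__":
    model = add_interlayer_processing(torchvision.models.resnet18(num_classes=10))
    out = model(torch.randn(2, 3, 32, 32))
    print(out.shape)  # torch.Size([2, 10])
```

Under these assumptions, the wrapped model can be trained with any standard adversarial-training loop (e.g., PGD-based training with a TRADES- or MART-style loss); the quantization simply acts as an extra non-linearity in the forward pass.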
