Abstract

Adversarial attacks can fool convolutional networks and leave systems vulnerable to fraud and deception, so defending against such malicious attacks is a critical practical challenge. These attacks are typically conducted by adding tiny perturbations to images that cause the network to misclassify them. Noise reduction can defend against such attacks, but it is not suited to every case. Considering that different models have different tolerances to adversarial perturbations, we develop a novel detection module that removes noise through an adaptive process and detects adversarial attacks without modifying the models themselves. Experimental results show that, by comparing classification results on adversarial samples from MNIST and from two subclasses of ImageNet, our models successfully remove most of the noise and obtain detection accuracies of 97.71% and 92.96%, respectively. Furthermore, our adaptive module can be assembled into different networks, achieving detection accuracies of 70.83% and 71.96% on white-box adversarial images crafted against ResNet18 and SCD01MLP, respectively. The best accuracy of 62.5% is obtained for both networks when dealing with black-box attacks.
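The sketch below illustrates only the general idea described in the abstract, not the paper's actual adaptive module: an input is flagged as adversarial when the (unmodified) classifier's prediction changes after denoising. The median filter, the `denoise` helper, and the `is_adversarial` function are hypothetical stand-ins introduced for illustration under that assumption.

```python
# Illustrative sketch (assumption, not the authors' method): detect an
# adversarial input by checking whether the classifier's label flips
# after a simple denoising step. The classifier itself is left unmodified.
import torch
import torch.nn.functional as F


def denoise(x: torch.Tensor, kernel_size: int = 3) -> torch.Tensor:
    """Median-filter an NCHW image batch (stand-in for the adaptive process)."""
    pad = kernel_size // 2
    padded = F.pad(x, (pad, pad, pad, pad), mode="reflect")
    # Extract k x k neighborhoods and take their per-pixel median.
    patches = padded.unfold(2, kernel_size, 1).unfold(3, kernel_size, 1)
    return patches.contiguous().flatten(-2).median(dim=-1).values


@torch.no_grad()
def is_adversarial(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Return a boolean per image: True if the prediction changes after denoising."""
    pred_raw = model(x).argmax(dim=1)
    pred_denoised = model(denoise(x)).argmax(dim=1)
    return pred_raw != pred_denoised
```

Clean images tend to keep the same label after mild denoising, whereas adversarial perturbations are often fragile to it, which is what makes this comparison usable as a detection signal.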
