An auxiliary correction model implemented by the Correction Property of intermediate layer against adversarial examples

Xiangyuan Yang, Jie Lin, Hanlin Zhang, Xinyu Yang, Peng Zhao

doi:10.1016/j.asoc.2024.112367

Abstract

In powerful adversarial attacks against deep neural networks (DNNs), the generated adversarial example misleads the DNN-implemented classifier by destroying the features of the last layer. To enhance the robustness of the classifier, we propose a Feature Analysis and Conditional Matching prediction distribution (FACM) model that uses the features of intermediate layers to correct the misclassification. Specifically, we first prove that the intermediate layers of the classifier still retain effective features for the original category when the classifier is subjected to adversarial attacks; we define this as the Correction Property. Based on this property, we propose the FACM model, which consists of a Feature Analysis (FA) correction module, a Conditional Matching Prediction Distribution (CMPD) correction module, and a decision module. The FA correction module is composed of fully connected layers and takes the features of the intermediate layers as input to correct the misclassification of the classifier. The CMPD correction module is based on a conditional autoencoder trained with the Kullback–Leibler loss to match the prediction distribution; it uses the features of intermediate layers as the condition to accelerate convergence and also mitigates the negative effect of adversarial examples. Building on the empirically verified Diversity Property among the individual correction modules, the decision module integrates the correction modules to enhance the robustness of the DNN-implemented classifier by reducing the dimensionality of the adversarial subspace. That is, an input perturbed in certain directions (i.e., dimensions) that cause the classifier to misclassify can still be correctly classified by the proposed correction modules. Extensive experiments demonstrate that our FACM model outperforms existing methods against adversarial attacks, especially optimization-based white-box attacks and query-based black-box attacks.
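
As a rough illustration of two ideas in the abstract, the following Python sketch computes a Kullback–Leibler divergence between two prediction distributions and combines the outputs of a classifier and two correction modules. Everything here is an assumption made for illustration: the logits are made up, the function names (`softmax`, `kl_divergence`, `fuse_predictions`) are hypothetical, and averaging the distributions is only a simple stand-in for a decision module, since the abstract does not specify the actual decision rule.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def fuse_predictions(distributions):
    """Hypothetical decision rule: average the distributions, then argmax."""
    n_classes = len(distributions[0])
    avg = [
        sum(d[k] for d in distributions) / len(distributions)
        for k in range(n_classes)
    ]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg


# Made-up logits: the classifier's last layer is fooled into class 0,
# while two stand-in correction modules still favor the true class 1.
classifier_pred = softmax([2.0, 0.5, 0.1])
fa_pred = softmax([0.2, 2.5, 0.1])    # stand-in for an FA-style output
cmpd_pred = softmax([0.3, 2.0, 0.4])  # stand-in for a CMPD-style output

# A KL term of this form could serve as a distribution-matching loss.
mismatch = kl_divergence(fa_pred, classifier_pred)

label, avg = fuse_predictions([classifier_pred, fa_pred, cmpd_pred])
print("fused label:", label)
print("KL(fa || classifier):", round(mismatch, 4))
```

In this toy setting the fused label is the class favored by the two correction-module stand-ins, even though the classifier alone picks a different class. This loosely mirrors the abstract's claim that inputs misclassified by the classifier can be recovered by the correction modules.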

More From: Applied Soft Computing

Similar Papers

Generating watermarked adversarial texts
Mingjie Li ... Hanzhou Wu
Journal of Electronic Imaging | VOL. 32
28 Mar 2023

Restoration of Adversarial Examples Using Image Arithmetic Operations
Kazim Ali ... Adnan N Quershi
Intelligent Automation & Soft Computing | VOL. 32
01 Jan 2021

Adversarial Examples in Deep Neural Networks: An Overview
Emilio Rafael Balda ... Rudolf Mathar
24 Oct 2019

FCDM: A Methodology Based on Sensor Pattern Noise Fingerprinting for Fast Confidence Detection to Adversarial Attacks
Yazhu Lan ... Guohe Zhang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | VOL. 39
31 Jan 2020
