Abstract

As an emerging threat to deep neural networks (DNNs), backdoor attacks have received increasing attention due to the challenges posed by the lack of transparency inherent in DNNs. In this article, we develop an efficient algorithm, grounded in the interpretability of DNNs, to defend against backdoor attacks on DNN models. To extract critical neurons, we deploy sets of control gates after the neurons in each layer, so that the function of a DNN model can be interpreted through the semantic sensitivities of its neurons to input samples. A backdoor identification approach, derived from the activation frequency distribution over critical neurons, is proposed to reveal anomalies in particular neurons produced by backdoor attacks. Subsequently, a feasible and fine-grained pruning strategy is introduced to eliminate backdoors hidden in DNN models, without the need for retraining. Extensive experiments demonstrate that the proposed algorithm can identify and eliminate malicious backdoors efficiently in both single-target and multitarget scenarios, while largely retaining the performance of the DNN model.
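To illustrate the general flavor of an activation-frequency-based defense, the following is a minimal, hypothetical PyTorch sketch; it is not the paper's actual algorithm. The layer choice, the criterion for a neuron "activating", the pruning threshold, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch (not the paper's method): estimate how often each channel
# of one convolutional layer activates on clean data, then zero out channels
# that almost never fire, on the intuition that backdoor-related neurons tend
# to stay dormant for benign inputs.
import torch
import torch.nn as nn


def activation_frequency(model: nn.Module, layer: nn.Conv2d,
                         loader, device="cpu") -> torch.Tensor:
    """Fraction of clean inputs on which each output channel of `layer` activates."""
    feats = {}

    def hook(_, __, output):
        feats["out"] = output.detach()

    handle = layer.register_forward_hook(hook)
    counts, total = None, 0
    model.eval()
    with torch.no_grad():
        for x, _ in loader:
            x = x.to(device)
            model(x)
            # A channel "activates" if its mean spatial response is positive
            # (an assumed criterion, chosen only for this sketch).
            act = (feats["out"].mean(dim=(2, 3)) > 0).float().sum(dim=0)
            counts = act if counts is None else counts + act
            total += x.size(0)
    handle.remove()
    return counts / total


def prune_dormant_channels(conv: nn.Conv2d, freq: torch.Tensor, thresh: float = 0.01):
    """Zero the weights (and biases) of channels whose clean activation frequency
    falls below `thresh`, removing them without any retraining."""
    mask = (freq >= thresh).float().view(-1, 1, 1, 1)
    with torch.no_grad():
        conv.weight.mul_(mask)
        if conv.bias is not None:
            conv.bias.mul_(mask.view(-1))
```

In this sketch, pruning is done by masking weights in place rather than restructuring the network, which keeps the model architecture unchanged and avoids any fine-tuning step.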
