Abstract

This paper proposes a new defense against neural network backdooring attacks, in which a network is maliciously trained to mispredict in the presence of attacker-chosen triggers. Our defense is based on the intuition that the feature extraction layers of a backdoored network embed new features to detect the presence of a trigger, and the subsequent classification layers learn to mispredict when a trigger is detected. Therefore, to detect backdoors, the proposed defense uses two synergistic anomaly detectors trained on clean validation data: the first is a novelty detector that checks for anomalous features, while the second detects anomalous mappings from features to outputs by comparing the network's predictions with those of a separate classifier trained on the validation data. The approach is evaluated on a wide range of backdoored networks (with multiple variations of triggers) that successfully evade state-of-the-art defenses. Additionally, we evaluate the robustness of our approach to imperceptible perturbations, its scalability to large-scale datasets, and its effectiveness under domain shift. This paper also shows that the defense can be further improved using data augmentation.
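
The two-detector idea described above can be sketched as follows. This is only an illustration of the pipeline as stated in the abstract, not the paper's implementation: it assumes penultimate-layer features have already been extracted from the suspect network, and the detector choices (IsolationForest as the novelty detector, LogisticRegression as the auxiliary classifier), feature dimensions, and synthetic arrays are assumptions made for the example.

```python
# Minimal sketch of the two synergistic detectors, assuming features are
# already extracted from the suspect network's penultimate layer.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-ins for: clean validation features and labels, features of incoming
# inputs, and the suspect network's predictions on those inputs.
val_feats = rng.normal(size=(500, 64))
val_labels = rng.integers(0, 10, size=500)
test_feats = rng.normal(size=(100, 64))
suspect_preds = rng.integers(0, 10, size=100)

# Detector 1: novelty detector over clean validation features.
novelty = IsolationForest(random_state=0).fit(val_feats)

# Detector 2: auxiliary classifier approximating the clean
# feature-to-label mapping, trained only on validation data.
aux_clf = LogisticRegression(max_iter=1000).fit(val_feats, val_labels)

# Flag an input if its features look novel OR the suspect network's
# prediction disagrees with the auxiliary classifier.
is_novel = novelty.predict(test_feats) == -1
disagrees = aux_clf.predict(test_feats) != suspect_preds
flagged = is_novel | disagrees
print(f"{flagged.sum()} of {len(flagged)} inputs flagged as suspicious")
```

Flagging on the disjunction of the two detectors mirrors the "synergistic" combination described in the abstract; in practice the decision rule and any thresholds would be tuned on the clean validation data.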

Highlights

  • We propose Removing Adversarial Backdoors by Iterative Demarcation (RAID), which (1) makes minimal assumptions about the backdoor operation, (2) is effective with a small clean validation set, (3) is computationally efficient and can be updated in real time during online operation (a minimal sketch of such an update follows this list), and (4) has consistently good performance on several datasets under different backdoors

  • We show the efficacy of our approach under various conditions, including multiple triggers and adaptive attacks, imperceptible triggers, large-scale datasets, various attack densities, and various update frequencies
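
The highlights mention that the detector can be updated in real time during online operation. The following is a hypothetical sketch of such an update loop: a pool of clean features grows as incoming inputs are accepted, and the novelty detector is periodically re-fit. The pool size, acceptance rule, update interval, and detector choice are illustrative assumptions, not the paper's procedure.

```python
# Hypothetical online-update loop: grow the clean feature pool during
# deployment and periodically re-fit the novelty detector.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Initial pool of clean validation features (stand-in values).
clean_pool = list(rng.normal(size=(200, 64)))
detector = IsolationForest(random_state=0).fit(np.asarray(clean_pool))

UPDATE_EVERY = 50      # re-fit after this many newly accepted samples (assumed value)
new_since_refit = 0

for step in range(300):
    feat = rng.normal(size=(1, 64))        # feature vector of an incoming input
    if detector.predict(feat)[0] == 1:     # looks clean under the current detector
        clean_pool.append(feat[0])         # grow the clean pool
        new_since_refit += 1
    if new_since_refit >= UPDATE_EVERY:    # periodic re-fit keeps the detector current
        detector = IsolationForest(random_state=0).fit(np.asarray(clean_pool))
        new_since_refit = 0
```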


Introduction

Deep neural networks (DNNs) are widely used in various applications including object detection (Ren et al [1]), face recognition (Sun et al [2], Taigman et al [3]), natural language processing (Collobert et al [4], Bahdanau et al [5]), self-driving (Chen et al [6]), navigation (Fu et al [7]), surveillance (Osia et al [8]), and cyber-physical systems security (Patel et al [9, 10, 11]). Training-time attacks (Chen et al [20], Gu et al [21]) are drawing increasing attention. This is because training DNNs is challenging, especially for individuals or small entities, due to the difficulty of obtaining large, high-quality labeled datasets and the cost of maintaining or renting the computational resources needed to train a complex model (Esteva et al [22], Roh et al [23], Halevy et al [24]).

Scenario: The user wishes to train a DNN F for a classification task on a training dataset S sampled from the data distribution D. The user outsources the training task to a third party (the attacker). The attacker has neither access to the user's validation dataset nor the ability to change the model structure after training.
