Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification

Siyuan Cheng, Shiqing Ma, Xiangyu Zhang, Yingqi Liu

doi:10.1609/aaai.v35i2.16201

Abstract

A trojan (backdoor) attack is a form of adversarial attack on deep neural networks in which the attacker provides victims with a model trained or retrained on malicious data. The backdoor is activated when a normal input is stamped with a certain pattern, called the trigger, causing misclassification. Many existing trojan attacks use triggers that are input-space patches or objects (e.g., a polygon with solid color) or simple input transformations such as Instagram filters. These simple triggers are susceptible to recent backdoor detection algorithms. We propose a novel deep feature space trojan attack with five characteristics: effectiveness, stealthiness, controllability, robustness, and reliance on deep features. We conduct extensive experiments on 9 image classifiers on various datasets, including ImageNet, to demonstrate these properties and show that our attack can evade state-of-the-art defenses.
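
To make the notion of an input-space trigger concrete, the following is a minimal sketch of the simple "solid-color patch" style of trigger that the abstract contrasts against. It is not the deep feature space attack proposed in the paper. The function name `stamp_patch_trigger`, its parameters, and the 32 x 32 image size are assumptions chosen only for illustration.

```python
import numpy as np


def stamp_patch_trigger(image, patch_size=4, color=(255, 0, 0), corner=(0, 0)):
    """Return a copy of `image` (H x W x 3, uint8) with a solid-color square stamped on it.

    Hypothetical helper: it illustrates a simple input-space patch trigger,
    not the feature-space trigger described in the paper.
    """
    stamped = image.copy()  # leave the clean input untouched
    row, col = corner
    # Overwrite a patch_size x patch_size region with the chosen color.
    stamped[row:row + patch_size, col:col + patch_size, :] = np.asarray(
        color, dtype=image.dtype
    )
    return stamped


# Example: stamp a 4 x 4 red square in the bottom-right corner of a black image.
clean = np.zeros((32, 32, 3), dtype=np.uint8)
triggered = stamp_patch_trigger(clean, patch_size=4, color=(255, 0, 0), corner=(28, 28))

# Count the pixels that differ between the clean and the stamped image.
changed_pixels = int(np.any(triggered != clean, axis=-1).sum())
```

Because such a trigger changes only a small, fixed region of the input, it is the kind of simple pattern that the abstract describes as susceptible to backdoor detection algorithms.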

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: May 18, 2021
Citations: 48

Similar Papers

Design and Evaluation of a Multi-Domain Trojan Detection Method on Deep Neural Networks
Yansong Gao ... Surya Nepal
IEEE Transactions on Dependable and Secure Computing | VOL. 19
02 Feb 2021

Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases
Ren Wang ... Gaoyuan Zhang
01 Jan 2020

SanitAIs: Unsupervised Data Augmentation to Sanitize Trojaned Neural Networks
Kiran Karra ... Chace Ashcraft
12 Sep 2022

FriendNet Backdoor
Hyun Kwon ... Hyunsoo Yoon
12 Jan 2020
