Inspecting Prediction Confidence for Detecting Black-Box Backdoor Attacks

Tong Wang, Feng Xu, Ting Wang, Yuan Yao, Miao Xu, Shengwei An

doi:10.1609/aaai.v38i1.27780

Abstract

Backdoor attacks have been shown to be a serious security threat against deep learning models, and various defenses have been proposed to detect whether a model is backdoored. However, as a recent black-box attack indicates, existing defenses can be easily bypassed by implanting the backdoor in the frequency domain. To this end, we propose a new defense, DTInspector, against black-box backdoor attacks, based on a new observation about the prediction confidence of learning models: to achieve a high attack success rate with a small amount of poisoned data, backdoor attacks usually cause a model to exhibit statistically higher prediction confidences on the poisoned samples. We provide both theoretical and empirical evidence for the generality of this observation. DTInspector then carefully examines the prediction confidences of data samples and decides whether a backdoor exists by using the shortcut nature of backdoor triggers. Extensive evaluations on six backdoor attacks, four datasets, and three advanced attack types demonstrate the effectiveness of the proposed defense.
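The observation about prediction confidence can be illustrated with a small sketch. This is not the DTInspector procedure, which the abstract does not specify; it only shows one plausible way to measure confidence as the top softmax probability and to pick out the most confident samples per predicted label. The function names (`softmax`, `prediction_confidence`, `top_k_by_confidence`) and the choice of softmax probability as the confidence measure are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_confidence(logits):
    """Return (predicted_label, confidence).

    Confidence is taken to be the top softmax probability
    (an assumption; the abstract does not define the measure).
    """
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]


def top_k_by_confidence(all_logits, k):
    """Group samples by predicted label and keep the indices of the
    k most confident samples for each label (hypothetical helper)."""
    by_label = {}
    for idx, logits in enumerate(all_logits):
        label, conf = prediction_confidence(logits)
        by_label.setdefault(label, []).append((conf, idx))
    return {
        label: [idx for _, idx in sorted(items, reverse=True)[:k]]
        for label, items in by_label.items()
    }
```

Under the abstract's observation, poisoned samples would tend to appear among the high-confidence samples of the target label, so a defense could inspect those samples first; how DTInspector actually uses them is described in the paper, not here.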


Similar Papers

One-to-N & N-to-One: Two Advanced Backdoor Attacks Against Deep Learning Models
Mingfu Xue ... Weiqiang Liu
IEEE Transactions on Dependable and Secure Computing | VOL. 19
02 Oct 2020

B3: Backdoor Attacks against Black-box Machine Learning Models
Xueluan Gong ... Huayang Huang
ACM Transactions on Privacy and Security | VOL. 26
08 Aug 2023

Backdoor Attacks on Time Series: A Generative Approach
Yujing Jiang ... Sarah Monazam Erfani
01 Feb 2023

Backdoor Attacks with Wavelet Embedding: Revealing and enhancing the insights of vulnerabilities in visual object detection models on transformers within digital twin systems
Mingkai Shen ... Ruwei Huang
Advanced Engineering Informatics | VOL. 60
09 Jan 2024
