Abstract

In black-box adversarial attacks, attackers query the deep neural network (DNN) and use the query results to optimize the adversarial samples iteratively. In this paper, we study the method of adding white noise to the DNN output to mitigate such attacks. One of our unique contributions is a theoretical analysis of the gradient signal-to-noise ratio (SNR), which shows the trade-off between the defense noise level and the attack query cost. The attacker’s query count (QC) is derived mathematically as a function of the noise standard deviation. This guides the defender in choosing the appropriate noise level to mitigate attacks to a desired security level, specified by the QC and the DNN performance loss. Our analysis shows that the added noise is drastically magnified by the small variation of DNN outputs, which makes the reconstructed gradient have an extremely low SNR. Adding white noise with a very small standard deviation, e.g., less than 0.01, is enough to increase the QC by many orders of magnitude without any noticeable reduction in classification accuracy. Our experiments demonstrate that this method can effectively mitigate both soft-label and hard-label black-box attacks under realistic QC constraints. We also show that this method outperforms many other defense methods and is robust to the attacker’s countermeasures.
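
As a rough illustration of the defense described above, the sketch below adds zero-mean white Gaussian noise to a classifier's returned probabilities before they are sent back to the querier. This is a minimal sketch under assumptions: the function name, the default sigma, and the clip-and-renormalize step are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def noisy_output(logits, sigma=0.01, rng=None):
    """Return class probabilities perturbed by zero-mean white Gaussian noise.

    sigma is the defense noise standard deviation; the abstract argues that
    even sigma < 0.01 can raise the attacker's query count by orders of
    magnitude while leaving classification accuracy essentially unchanged.
    """
    rng = rng if rng is not None else np.random.default_rng()
    # Numerically stable softmax over the raw logits.
    z = logits - np.max(logits)
    probs = np.exp(z) / np.sum(np.exp(z))
    # Add i.i.d. white Gaussian noise to every returned score.
    noisy = probs + rng.normal(0.0, sigma, size=probs.shape)
    # Clip and renormalize so the response still resembles a probability
    # vector (an implementation choice, not specified by the paper).
    noisy = np.clip(noisy, 0.0, 1.0)
    return noisy / np.sum(noisy)

# Example: for well-separated classes the top-1 prediction is unaffected.
logits = np.array([2.0, 0.5, -1.0])
print(np.argmax(noisy_output(logits)))  # almost always 0
```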

Highlights

  • Along with the rapid development of deep neural networks (DNNs), many online services, such as the Clarifai API, Google Photos, advertisement detection, and fake news filtering, rely heavily on DNNs.

  • An intriguing issue is that DNNs are highly susceptible to small variations in input data [1].

  • White-box attacks assume that the attackers have complete knowledge of the deep network, while black-box attacks assume that the attackers have limited knowledge, typically some output information of the DNNs.

Summary

Introduction

Along with the rapid development of deep neural networks (DNNs), many online services, such as the Clarifai API, Google Photos, advertisement detection, and fake news filtering, rely heavily on DNNs. Online DNN servers suffer from adversarial attacks in which the attackers slightly change the input data to make DNNs produce false results or misclassifications [2]. Depending on the knowledge about the DNNs that the attackers have, adversarial attacks can be classified into white-box attacks [1], [3]–[5] and black-box attacks [6]–[13]. The former assumes that the attackers have complete knowledge of the deep network, while the latter assumes that the attackers have limited knowledge, typically some output information of the DNNs. Compared with white-box attacks, black-box attacks are a more realistic threat to real-world practical applications.
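
To make the black-box setting concrete, the following sketch (not taken from the paper) shows the zeroth-order, finite-difference gradient estimation that a typical soft-label black-box attacker performs with query access only; it is exactly this reconstructed gradient whose SNR collapses when the defender adds output noise. The `query` callable, `delta`, and `n_samples` are hypothetical placeholders.

```python
import numpy as np

def estimate_gradient(query, x, label, delta=1e-3, n_samples=50, rng=None):
    """Estimate the gradient of the target-class score w.r.t. the input x.

    `query(x)` stands for one black-box call that returns the model's class
    probabilities (possibly perturbed by the defender's white noise). Each
    pair of calls measures a tiny score difference, so additive output noise
    of comparable magnitude drowns out the signal and forces far more queries.
    """
    rng = rng if rng is not None else np.random.default_rng()
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        # Random unit direction for a two-sided finite difference.
        u = rng.normal(size=x.shape)
        u /= np.linalg.norm(u)
        diff = query(x + delta * u)[label] - query(x - delta * u)[label]
        grad += (diff / (2.0 * delta)) * u
    return grad / n_samples
```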
