Abstract

Although deep neural networks (DNNs) have achieved high performance across various applications, they are often deceived by adversarial examples generated by adding small perturbations. To combat adversarial attacks, many detection methods have been proposed, including feature squeezing and trapdoor. However, these methods rely on the output of DNNs or involve training a separate network to detect adversarial examples, which leads to high computational costs and low efficiency. In this study, we propose a simple and effective approach called the entropy-based detector (EBD) to protect DNNs from various adversarial attacks. EBD detects adversarial examples by comparing the entropy of the input sample before and after bit depth reduction. We show that EBD can detect over 98% of the adversarial examples generated by the fast gradient sign method (FGSM), the basic iterative method (BIM), the momentum iterative method (MIM), DeepFool, and the CW attack at a false positive rate of 2.5% on the CIFAR-10 and ImageNet datasets.
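
The abstract describes EBD as comparing an input's entropy before and after bit depth reduction. Below is a minimal, illustrative sketch of that idea; the specific bit depth, histogram binning, and detection threshold are assumptions for demonstration and are not values taken from the paper.

```python
import numpy as np

def shannon_entropy(image: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy (in bits) of the pixel-intensity histogram."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins to avoid log(0)
    return float(-np.sum(p * np.log2(p)))

def reduce_bit_depth(image: np.ndarray, bits: int = 4) -> np.ndarray:
    """Quantize 8-bit pixel values to `bits` bits, then rescale back to [0, 255]."""
    levels = 2 ** bits - 1
    return np.round(image / 255.0 * levels) / levels * 255.0

def ebd_flag(image: np.ndarray, bits: int = 4, threshold: float = 0.5) -> bool:
    """Flag an input as adversarial if its entropy changes by more than
    `threshold` after bit depth reduction (threshold is a placeholder;
    in practice it would be calibrated to a target false positive rate)."""
    squeezed = reduce_bit_depth(image, bits=bits)
    delta = abs(shannon_entropy(image) - shannon_entropy(squeezed))
    return delta > threshold

# Example usage on a random 8-bit grayscale image:
# x = np.random.randint(0, 256, size=(32, 32)).astype(np.float64)
# print(ebd_flag(x))
```

In this reading, the intuition is that adversarial perturbations add low-amplitude, high-frequency noise that bit depth reduction largely removes, so the entropy of an adversarial input shifts more under quantization than that of a clean input.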
