Abstract

Black-box attacks against deep neural network (DNN) classifiers are receiving increasing attention because they represent a more practical real-world approach than white-box attacks. In black-box environments, adversaries have limited knowledge of the target model. This makes it difficult to estimate gradients for crafting adversarial examples, so powerful white-box algorithms cannot be applied directly to black-box attacks. A well-known black-box attack strategy therefore creates local DNNs, called substitute models, to emulate the target model. The adversaries then craft adversarial examples using the substitute models instead of the unknown target model. The substitute models are trained iteratively by querying the target model and observing the labels it returns. However, emulating a target model usually requires numerous queries because the new DNNs are trained from scratch. In this study, we propose a new training method for substitute models that minimizes the number of queries. We consider the number of queries an important factor in practical black-box attacks because real-world systems often restrict queries for security and financial reasons. To decrease the number of queries, the proposed method does not emulate the entire target model; it adjusts only the part of the classification boundary relevant to the current attack. Furthermore, it uses no queries in the pre-training phase and issues queries only in the retraining phase. The experimental results indicate that the proposed method is effective in terms of the number of queries and the attack success ratio against MNIST, VGGFace2, and ImageNet classifiers in query-limited black-box environments. We also demonstrate a black-box attack against a commercial classifier, Google AutoML Vision.
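The substitute-model strategy described above can be sketched in a few lines. The toy `target_model` below is a hypothetical stand-in for the unknown black-box classifier (the attacker observes only its output labels), and the substitute is a simple logistic-regression model rather than a DNN; both are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box target: the attacker can only observe its
# output labels, never its parameters or gradients.
def target_model(x):
    return (x.sum(axis=1) > 0).astype(int)

# Step 1: spend label queries on synthetic inputs to collect training data.
X = rng.normal(size=(200, 5))
y = target_model(X)                  # 200 queries to the target
queries_used = len(X)

# Step 2: train a logistic-regression substitute on the queried labels.
w, b = np.zeros(5), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # substitute's P(class 1)
    w -= 0.5 * (X.T @ (p - y)) / len(X)       # gradient step on log loss
    b -= 0.5 * (p - y).mean()

# Step 3: craft an FGSM-style adversarial example on the substitute
# (where gradients are available) and transfer it to the target.
x0 = X[y == 1][0]                    # a point the target labels class 1
x_adv = x0 - 2.0 * np.sign(w)        # perturb against the substitute gradient
```

Because the substitute approximates the target's decision boundary, the example crafted with white-box access to the substitute tends to transfer; the paper's contribution is reducing the queries spent in steps 1-2.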

Highlights

  • Deep neural network (DNN) classifiers have made significant progress in many domains such as image classification [1,2], voice recognition [3,4], malware detection [5,6], and natural language processing [7].

  • This study proposes a new method for training the substitute model with the purpose of decreasing the number of queries.

  • Because real-world services and systems often limit queries, minimizing the number of queries is an important factor in practical black-box attacks.


Introduction

Deep neural network (DNN) classifiers have made significant progress in many domains such as image classification [1,2], voice recognition [3,4], malware detection [5,6], and natural language processing [7]. Despite their great success, recent studies have demonstrated that DNNs are vulnerable to well-designed input samples called adversarial examples [8,9]. Manipulated traffic signs can confuse autonomous vehicles [10,11], and adversarial voices can deceive automatic voice recognition models [12,13] such as Apple’s Siri and Amazon’s Alexa.
