Abstract
Deep neural networks (DNNs) are vulnerable to adversarial examples. Even in the black-box setting, i.e., without access to the target model, transfer-based attacks can easily fool DNNs. To alleviate this problem, we propose a classification model that is robust against transfer attacks, built on the framework of Variational Auto-Encoders (VAEs), probabilistic generative models that have been successfully applied to a wide range of tasks. Specifically, our model simulates the data-generative process with several multivariate Gaussian distributions and DNNs: (1) We assume that the latent embedding produced by an encoder (a DNN) for each category follows a category-specific multivariate Gaussian distribution. (2) A decoder (a DNN) decodes the latent embedding into an observation. (3) Theoretical analysis shows that our model can predict a datum's label by maximizing the lower bound on the log-likelihood for each category via Bayes' theorem, with excellent robustness against transfer attacks. Inference in our model is performed variationally, so the Stochastic Gradient Variational Bayes (SGVB) estimator and the reparameterization trick can be used to optimize the evidence lower bound (ELBO). Experiments with quantitative comparisons show that our approach reaches the state of the art with significantly better robustness.
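The abstract's classification rule, maximizing a per-category ELBO with the SGVB estimator and the reparameterization trick, can be illustrated with a minimal sketch. This is not the paper's implementation: the toy identity decoder, the unit-variance class priors `priors`, and the helper names `elbo`, `reparameterize`, and `kl_gauss` are all assumptions introduced here for illustration; a real model would use trained encoder and decoder DNNs.

```python
import math
import random

random.seed(0)

def kl_gauss(mu_q, logvar_q, mu_p):
    # KL( N(mu_q, diag(exp(logvar_q))) || N(mu_p, I) ) for diagonal Gaussians
    return sum(0.5 * (math.exp(lv) + (mq - mp) ** 2 - 1.0 - lv)
               for mq, lv, mp in zip(mu_q, logvar_q, mu_p))

def reparameterize(mu, logvar):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I),
    # so gradients can flow through the sampling step.
    return [m + math.exp(0.5 * lv) * random.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def elbo(x, mu_q, logvar_q, mu_prior, decode, n_samples=8):
    # SGVB (Monte Carlo) estimate of the evidence lower bound:
    #   E_q[log p(x|z)] - KL(q(z|x) || p(z|y))
    recon = 0.0
    for _ in range(n_samples):
        z = reparameterize(mu_q, logvar_q)
        x_hat = decode(z)
        # Gaussian log-likelihood with unit variance, up to an additive constant
        recon += -0.5 * sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))
    recon /= n_samples
    return recon - kl_gauss(mu_q, logvar_q, mu_prior)

# Toy setup (assumed): one unit-variance Gaussian prior per category,
# an "encoder" that returns the input as the posterior mean, and an
# identity "decoder". The predicted label is the category whose ELBO
# (a lower bound on log p(x, y)) is largest, mirroring the Bayes-rule
# classification described in the abstract.
priors = {0: [-2.0, -2.0], 1: [2.0, 2.0]}

def predict(x):
    mu_q = list(x)
    logvar_q = [math.log(0.1)] * len(x)
    scores = {y: elbo(x, mu_q, logvar_q, mu_p, decode=lambda z: z)
              for y, mu_p in priors.items()}
    return max(scores, key=scores.get)
```

Because the approximate posterior is shared across categories in this sketch, only the KL term differs between classes, so `predict` selects the prior mean nearest to the input.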