Abstract

Bayesian neural networks (BNNs) are widely used because they provide a probabilistic treatment of deep learning models by placing a distribution over the model parameters. Although BNNs are a more robust deep learning paradigm than vanilla deep neural networks, their ability to withstand adversarial attacks in practice remains limited. In this study, we propose a novel multi-task adversarial training approach for improving the adversarial robustness of BNNs. Specifically, we first generate diverse and stronger adversarial examples for adversarial training by maximising a multi-task loss that combines an unsupervised feature-scattering loss with a supervised margin loss. We then learn the model parameters by minimising another multi-task loss composed of a feature loss and the variational inference loss. The feature loss is defined as the ℓp distance between the feature representations extracted from the clean and adversarial examples; minimising it increases feature similarity and helps the model learn more robust features, resulting in enhanced robustness. Extensive experiments on four benchmark datasets, under both white-box and black-box attack scenarios, demonstrate that the proposed approach significantly improves adversarial robustness compared with several state-of-the-art defence methods.
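
The following is a minimal Python (PyTorch-style) sketch of the two multi-task losses described above, not the authors' released code. It assumes a hypothetical model interface in which bnn(x) returns a (features, logits) tuple for one stochastic forward pass and bnn.kl_divergence() returns the KL term of the variational inference (ELBO) loss; for brevity, an ℓ2 feature distance and PyTorch's standard multi-class margin loss stand in for the paper's feature-scattering and margin terms, and the weighting lam and attack hyperparameters are illustrative choices.

# Hypothetical sketch under the assumptions stated above, not the authors' code.
import torch
import torch.nn.functional as F

def generate_adversarial(bnn, x, y, eps=8 / 255, step=2 / 255, iters=7):
    """Craft adversarial examples by maximising a multi-task attack loss."""
    with torch.no_grad():
        feat_clean, _ = bnn(x)                      # reference clean features
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(iters):
        x_adv = x_adv.detach().requires_grad_(True)
        feat_adv, logits_adv = bnn(x_adv)
        # Attack loss: unsupervised feature divergence (ell_2 surrogate for
        # feature scattering) + supervised margin loss.
        attack_loss = F.mse_loss(feat_adv, feat_clean) \
            + F.multi_margin_loss(logits_adv, y)
        grad, = torch.autograd.grad(attack_loss, x_adv)
        x_adv = x_adv + step * grad.sign()          # PGD-style ascent step
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def training_loss(bnn, x, y, lam=1.0):
    """Minimise feature loss + variational inference loss over the weights."""
    x_adv = generate_adversarial(bnn, x, y)
    feat_clean, _ = bnn(x)
    feat_adv, logits_adv = bnn(x_adv)
    # lp feature distance (p = 2) between clean and adversarial representations.
    feature_loss = (feat_adv - feat_clean).norm(p=2, dim=1).mean()
    # ELBO-style variational loss: data term on adversarial examples + KL term.
    vi_loss = F.cross_entropy(logits_adv, y) + bnn.kl_divergence()
    return vi_loss + lam * feature_loss

Minimising training_loss with any standard optimiser couples the two objectives: the variational term fits the posterior over weights, while the feature term pulls clean and adversarial representations together, which is the mechanism the abstract credits for the improved robustness.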
