Improving adversarial robustness by learning shared information

Xi Yu, Niklas Smedemark-Margulies, Shuchin Aeron, Toshiaki Koike-Akino, Pierre Moulin, Matthew Brand, Kieran Parsons, Ye Wang

doi:10.1016/j.patcog.2022.109054

Abstract

We consider the problem of improving the adversarial robustness of neural networks while retaining natural accuracy. Motivated by the multi-view information bottleneck formalism, we seek to learn a representation that captures the information shared between clean samples and their corresponding adversarial samples while discarding these samples’ view-specific information. We show that this approach leads to a novel multi-objective loss function, and we provide mathematical motivation for how its components improve the tradeoff between robust and natural accuracy. We demonstrate an improved tradeoff compared to current state-of-the-art methods through extensive evaluation on various benchmark image datasets and architectures. Ablation studies indicate that learning shared representations is key to improving performance.

Full Text