Purifying Adversarial Images Using Adversarial Autoencoder With Conditional Normalizing Flows

Yi Ji,Isao Echizen,Trung-Nghia Le,Huy H Nguyen

doi:10.1109/ojsp.2023.3275053

Open Access

https://doi.org/10.1109/ojsp.2023.3275053

Abstract

We present a target-agnostic adversarial autoencoder with conditional normalizing flows that, given any unlabeled image dataset, purifies adversarial samples into clean images, i.e., removes adversarial noise from the images while preserving their visual quality. In our interpretation of the model, samples are processed by manifold projection: the encoder brings each sample back into a posterior data distribution in latent space, so that the sample is less likely to be irregular with respect to the learned representation of any target classifier. Normalizing flows conditioned on top of our hybrid network structure, together with walk-back training, are used to address common drawbacks of generative-model and autoencoder-based approaches: not only the trade-off between compression loss and over-fitting on training data but also the structural model dependency on dataset classes and labels. Experiments demonstrated that our proposed model is preferable to existing target-agnostic adversarial defense methods, particularly for large and unlabeled image datasets.
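As a rough illustration of the building block that "conditional normalizing flows" usually refers to, the sketch below implements a single conditional affine coupling layer in plain Python. It is not the authors' architecture: the toy conditioner `_scale_shift`, the weight lists `w_s` and `w_t`, and the conditioning vector `cond` are all hypothetical names chosen for this example. It only shows the two properties that make such a layer usable in a flow: it is exactly invertible given the same condition, and its log-determinant is cheap to compute.

```python
import math

def _scale_shift(x_a, cond, w_s, w_t):
    """Toy conditioner (an assumption, not the paper's network).

    Produces a bounded log-scale and a shift for each transformed
    coordinate, depending on the untouched half x_a and the condition.
    """
    h = sum(x_a) + sum(cond)
    log_s = [math.tanh(w * h) for w in w_s]  # bounded for numerical stability
    shift = [w * h for w in w_t]
    return log_s, shift

def coupling_forward(x, cond, w_s, w_t):
    """Map x -> y; the first half passes through, the second half is rescaled and shifted."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, shift = _scale_shift(x_a, cond, w_s, w_t)
    y_b = [xb * math.exp(s) + t for xb, s, t in zip(x_b, log_s, shift)]
    log_det = sum(log_s)  # log |det Jacobian| of the affine map
    return x_a + y_b, log_det

def coupling_inverse(y, cond, w_s, w_t):
    """Map y -> x using the same condition; exact inverse of coupling_forward."""
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    log_s, shift = _scale_shift(y_a, cond, w_s, w_t)
    x_b = [(yb - t) * math.exp(-s) for yb, s, t in zip(y_b, log_s, shift)]
    return y_a + x_b

# Usage: a 4-dimensional vector and a 2-dimensional condition.
x = [0.5, -1.0, 2.0, 0.3]
cond = [1.0, 0.0]
w_s, w_t = [0.1, -0.2], [0.3, 0.4]
y, log_det = coupling_forward(x, cond, w_s, w_t)
x_rec = coupling_inverse(y, cond, w_s, w_t)
```

Because the first half of the input is passed through unchanged, the inverse can recompute the same scale and shift from it, which is why `x_rec` matches `x` up to floating-point error. In a real model the conditioner would be a neural network and many such layers would be stacked with permutations in between.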