Abstract

Recent studies in adversarial machine learning have highlighted the poor robustness of convolutional neural networks (CNNs) to small, carefully crafted variations of their inputs. Previous work in this area has largely focused on tiny image perturbations that cause CNNs to make high-confidence misclassifications while leaving the image visually unchanged to a human observer. Such attacks modify individual pixels of an image and are unlikely to arise in a natural environment. More recent work has demonstrated that CNNs are also vulnerable to simple transformations of the input image, such as rotations and translations. These ‘natural’ transformations are far more likely to occur, accidentally or intentionally, in real-world scenarios; indeed, humans experience and successfully recognize countless objects under such transformations every day. In this paper, we study the effect of these transformations on CNN accuracy when classifying 3D face-like objects (Greebles). We also visualize the feature representations learned by the CNNs, analyze how robust these representations are, and compare them to the human visual system. This work serves as a basis for future research into the differences between CNN and human object recognition, particularly in the context of adversarial examples.
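
To make the transformations concrete, the sketch below shows one way a rotation/translation robustness sweep of this kind could be set up with PyTorch and torchvision. It is an illustration only, not the paper's experimental code: the model, data loader, and the angle/shift grid are hypothetical placeholders, and the Greeble stimuli are not reproduced here.

```python
# Minimal sketch (assumed setup, not the paper's code): measure CNN accuracy
# after applying a fixed rotation and translation to every test image.
import torch
import torchvision.transforms.functional as TF

def accuracy_under_transform(model, loader, angle=0.0, dx=0, dy=0, device="cpu"):
    """Classification accuracy after rotating by `angle` degrees and
    translating by (dx, dy) pixels."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            # Apply the 'natural' spatial transformation to the whole batch.
            images = TF.affine(images, angle=angle, translate=[dx, dy],
                               scale=1.0, shear=[0.0])
            logits = model(images.to(device))
            correct += (logits.argmax(dim=1).cpu() == labels).sum().item()
            total += labels.size(0)
    return correct / total

# Hypothetical sweep over a small grid of rotations and horizontal shifts:
# for angle in [-30, -15, 0, 15, 30]:
#     for dx in [-8, 0, 8]:
#         acc = accuracy_under_transform(model, test_loader, angle, dx, 0)
#         print(f"angle={angle:+d}, dx={dx:+d}: acc={acc:.3f}")
```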
