Abstract

Scene recognition is an essential component of both machine and biological vision. Recent advances in computer vision using deep convolutional neural networks (CNNs) have demonstrated impressive sophistication in scene recognition, through training on large datasets of labeled scene images (Zhou et al. 2014, 2018). One criticism of CNN-based approaches is that performance may not generalize well beyond the training image set (Torralba and Efros 2011), and may be hampered by minor image modifications, which in some cases are barely perceptible to the human eye (Goodfellow et al. 2015; Szegedy et al. 2013). While these “adversarial examples” may be unlikely in natural contexts, during many real-world visual tasks scene information can be degraded or limited due to defocus blur, camera motion, sensor noise, or occluding objects. Here, we quantify the impact of several image degradations (some common, and some more exotic) on indoor/outdoor scene classification using CNNs. For comparison, we use human observers as a benchmark, and also evaluate performance against classifiers using limited, manually selected descriptors. While the CNNs outperformed the other classifiers and rivaled human accuracy for intact images, our results show that their classification accuracy is more affected by image degradations than that of human observers. On a practical level, however, accuracy of the CNNs remained well above chance for a wide range of image manipulations that disrupted both local and global image statistics. We also examine the level of image-by-image agreement with human observers, and find that the CNNs’ agreement with observers varied as a function of the nature of the image manipulation. In many cases, this agreement was not substantially different from the level one would expect for two independent classifiers. Together, these results suggest that CNN-based scene classification techniques are relatively robust to several image degradations. However, the pattern of classifications obtained for ambiguous images does not appear to closely reflect the strategies employed by human observers.
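The independence baseline mentioned above can be made concrete: if two binary (indoor/outdoor) classifiers decide each image independently, their expected image-by-image agreement follows directly from the marginal rates at which each assigns the two labels. A minimal sketch in Python (the function name and example rates are illustrative, not values from the paper):

```python
def expected_independent_agreement(p_indoor_a, p_indoor_b):
    """Expected fraction of images on which two statistically
    independent binary classifiers assign the same label, given
    the rate at which each assigns the 'indoor' label."""
    return p_indoor_a * p_indoor_b + (1 - p_indoor_a) * (1 - p_indoor_b)

# Illustrative rates (not from the paper): a CNN labeling 60% of
# images 'indoor' and a human observer labeling 55% 'indoor'
print(expected_independent_agreement(0.60, 0.55))  # 0.51
```

Observed agreement well above this baseline would indicate that two classifiers succeed and fail on the same images; agreement near it is what the abstract reports for many of the manipulations.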

Highlights

  • Recognizing the type of scene depicted in an image or video provides key contextual information with which other visual content—such as objects, actions, and people—can be disambiguated, recognized, and interpreted (Greene and Oliva 2009b; Groen et al. 2017).

  • The HSV-LDA classifier’s accuracy was not substantially different between the original and degraded images, likely because most of the manipulations altered the image statistics captured by the neural networks and the GIST descriptor while leaving global hue, saturation, and value largely intact (a minimal sketch of such a classifier follows this list).

  • Deep convolutional neural networks (CNNs) have emerged as a computer vision tool, both for practical applications and for modeling biological visual processing.
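A classifier of the kind referenced in the second highlight can be sketched by summarizing each image with a few global HSV statistics and training linear discriminant analysis on them. The per-channel mean/std features and scikit-learn calls below are assumptions for illustration; the paper's exact feature set is not specified here:

```python
import numpy as np
from skimage.color import rgb2hsv
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def hsv_features(img_rgb):
    """Global HSV summary of an (H, W, 3) RGB image: per-channel
    mean and standard deviation (an assumed, plausible feature set)."""
    hsv = rgb2hsv(img_rgb)  # skimage converts uint8 or float RGB to HSV floats
    return np.concatenate([hsv.mean(axis=(0, 1)), hsv.std(axis=(0, 1))])

def fit_hsv_lda(train_imgs, train_labels):
    """train_imgs: list of RGB arrays; train_labels: 0 = indoor, 1 = outdoor."""
    X = np.stack([hsv_features(im) for im in train_imgs])
    return LinearDiscriminantAnalysis().fit(X, train_labels)
```

Because these features ignore spatial layout entirely, degradations that scramble or occlude image regions leave them, and hence the classifier's accuracy, largely unchanged, consistent with the highlight above.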


Introduction

Recognizing the type of scene depicted in an image or video provides key contextual information with which other visual content—such as objects, actions, and people—can be disambiguated, recognized, and interpreted (Greene and Oliva 2009b; Groen et al. 2017). A related study compared a bag-of-words classifier with people’s performance on images with scrambled and missing pixel blocks (Parikh 2011). This model performed comparably to people on an outdoor dataset and worse than people on an indoor dataset, mirroring the results of the larger-scale study. While this prior work provides important insights into the type of image information that may be useful for scene classification (local versus global), these studies did not directly address classifier accuracy with degradations that are likely to occur under real-world conditions, nor did they include the more recently developed CNN approaches.
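One of the manipulations described above, scrambled pixel blocks, is simple to reproduce: the image is cut into tiles whose positions are then shuffled, which preserves local statistics while destroying global layout. A minimal sketch (the block size and implementation details are assumptions, not the parameters used by Parikh 2011):

```python
import random
import numpy as np

def scramble_blocks(img, block=32, seed=None):
    """Cut an image into block x block tiles and shuffle their
    positions; local statistics survive, global layout does not.
    (Block size is illustrative, not a value from the paper.)"""
    rng = random.Random(seed)
    h = img.shape[0] - img.shape[0] % block  # crop to a whole number of tiles
    w = img.shape[1] - img.shape[1] % block
    tiles = [img[y:y + block, x:x + block]
             for y in range(0, h, block) for x in range(0, w, block)]
    rng.shuffle(tiles)
    cols = w // block
    rows = [np.hstack(tiles[i:i + cols]) for i in range(0, len(tiles), cols)]
    return np.vstack(rows)
```

A missing-blocks variant would replace a random subset of tiles with a constant value instead of shuffling them.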

