Multi-Scale Spatial Concatenations of Local Features in Natural Scenes and Scene Classification

Xiaoyuan Zhu, Zhiyong Yang, Chris I Baker

doi:10.1371/journal.pone.0076393


Open Access


Abstract

How does the visual system encode natural scenes? What are the basic structures of natural scenes? Current models of scene perception use two broad kinds of feature representation: global and local. Both are useful and have had some success; however, many observations on human scene perception seem to point to an intermediate-level representation.

In this paper, we propose natural scene structures, i.e., multi-scale spatial concatenations of local features, as an intermediate-level representation of natural scenes. To compile the natural scene structures, we first sampled a large number of multi-scale circular scene patches in a hexagonal configuration. We then performed independent component analysis on the patches and grouped the independent components into a set of clusters using the K-means method. Finally, we obtained a set of natural scene structures, each of which is characterized by a set of dominant clusters of independent components.

We examined a range of statistics of the natural scene structures, compiled from two widely used datasets of natural scenes, and modeled their spatial arrangements at larger spatial scales using adjacency matrices. We found that the natural scene structures include a full range of concatenations of visual features in natural scenes and can be used to encode spatial information at various scales. We then selected a set of highly informative natural scene structures and used their occurrence frequencies and the eigenvalues of the adjacency matrices to classify scenes in the datasets. The performance of this model is comparable to or better than that of state-of-the-art models on the two datasets. These results suggest that natural scene structures are a useful intermediate-level representation of visual scenes for understanding natural scene perception.
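The last step of the pipeline described above, summarizing a scene by the occurrence frequencies of its structures and by an adjacency matrix of their spatial arrangement, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each scene has already been reduced to a rectangular grid of integer structure labels and that adjacency means horizontal or vertical neighbors, whereas the paper samples patches in a hexagonal configuration. The function names and the toy scene are hypothetical.

```python
from collections import Counter


def occurrence_frequencies(label_grid, n_structures):
    """Relative frequency of each structure label in one scene."""
    counts = Counter(label for row in label_grid for label in row)
    total = sum(counts.values())
    return [counts.get(k, 0) / total for k in range(n_structures)]


def adjacency_matrix(label_grid, n_structures):
    """Symmetric matrix A where A[i][j] counts how often structures i and j
    occupy horizontally or vertically neighboring grid cells.

    Assumption: 4-neighbor adjacency on a rectangular grid (illustrative only).
    """
    A = [[0] * n_structures for _ in range(n_structures)]
    n_rows = len(label_grid)
    for r in range(n_rows):
        for c in range(len(label_grid[r])):
            a = label_grid[r][c]
            # Look right and down only, so each neighboring pair is visited once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < len(label_grid[rr]):
                    b = label_grid[rr][cc]
                    A[a][b] += 1
                    if a != b:
                        A[b][a] += 1  # mirror the count to keep A symmetric
    return A


# Toy scene: a 3x3 grid of hypothetical structure labels (0, 1, 2).
scene = [
    [0, 0, 1],
    [0, 2, 1],
    [2, 2, 1],
]
freqs = occurrence_frequencies(scene, 3)
adj = adjacency_matrix(scene, 3)
print(freqs)
print(adj)
```

Because the matrix is symmetric, its eigenvalues are real and could be obtained with a routine such as `numpy.linalg.eigvalsh`, then combined with the frequency vector into a single feature vector for a classifier.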

Highlights

How does the visual system encode natural scenes? What are the basic structures of natural scenes and what are their statistics? These are important research topics in both human and computer vision [1,2,3,4,5,6,7,8,9].
The only limitations on the possible concatenations are induced by the clustering procedures, which can be made looser or tighter depending on specific applications.

Summary

Introduction

How does the visual system encode natural scenes? What are the basic structures of natural scenes and what are their statistics? These are important research topics in both human and computer vision [1,2,3,4,5,6,7,8,9]. We know that humans can grasp the gist of complex natural scenes quickly and remember extraordinarily rich details in thousands of scenes viewed for a brief period [10,11,12]. In current models of scene perception such as scene classification, there are two broad feature representations: global representations and local representations. Global representations such as GIST [8] and CENTRIST [9] encode structures of whole scenes and leave out local visual features and their spatial relationships at various scales. Local representations such as SIFT [13] and SURF [14] encode statistics of local features such as luminance gradients. Both representations are useful and have had some success; however, the above observations on human scene perception seem to point to a representation that lies in between local and global representations.



Journal: PLoS ONE
Publication Date: Sep 30, 2013
Citations: 3
License type: CC BY 4.0


