Abstract

How do humans rapidly recognize a scene? How can neural models capture this biological competence to achieve state-of-the-art scene classification? The ARTSCENE neural system classifies natural scene photographs by using multiple spatial scales to efficiently accumulate evidence for gist and texture. ARTSCENE embodies a coarse-to-fine Texture Size Ranking Principle whereby spatial attention processes multiple scales of scenic information, from global gist to local textures, to learn and recognize scenic properties. The model can incrementally learn and rapidly predict scene identity by gist information alone, and then accumulate learned evidence from scenic textures to refine this hypothesis. The model shows how texture-fitting allocations of spatial attention, called attentional shrouds, can facilitate scene recognition, particularly when they include a border of adjacent textures. Using grid gist plus three shroud textures on a benchmark photograph dataset, ARTSCENE discriminates 4 landscape scene categories (coast, forest, mountain, and countryside) with up to 91.85% correct on a test set, outperforms alternative models in the literature which use biologically implausible computations, and outperforms component systems that use either gist or texture information alone.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.