Abstract

Previous studies have suggested that lateral interactions of V1 cells are responsible for, among other visual effects, bottom-up visual attention (also referred to as visual salience or saliency). Our objective is to mimic these connections with a neurodynamic network of firing-rate neurons in order to predict visual attention. Early subcortical visual processes (i.e., retinal and thalamic) are functionally simulated. An implementation of the cortical magnification function is included to define the retinotopic projections towards V1, processing neuronal activity for each distinct view during scene observation. Novel computational definitions of top-down inhibition (in terms of inhibition of return, and oculomotor and selection mechanisms) are also proposed to predict attention in free-viewing and visual search tasks. Results show that our model outperforms other biologically inspired models of saliency prediction while simultaneously predicting visual saccade sequences. We also show how the temporal and spatial characteristics of saccade amplitude and inhibition of return can improve the prediction of saccades, and how distinct search strategies (in terms of feature-selective or category-specific inhibition) can predict attention in distinct image contexts.

Highlights

  • The human visual system (HVS) structure has evolved in a way that efficiently discriminates redundant information [1, 2, 3]

  • A computational implementation of the aforementioned framework (IKN), inspired by the early mechanisms of the HVS, extracted properties of the image as feature maps, obtained feature-wise conspicuity by computing center-surround differences as receptive field responses, and integrated them into a unique map using winner-take-all mechanisms. This framework served as a starting point for saliency modeling [8, 9] and gave rise to a myriad of computational models that differ in their computations but conserve a similar pipeline

  • In this study we have presented a biologically plausible model of visual attention that mimics visual mechanisms from the retina to V1 using real images


Introduction

The human visual system (HVS) structure has evolved in a way that efficiently discriminates redundant information [1, 2, 3]. A computational implementation of the aforementioned framework (IKN), inspired by the early mechanisms of the HVS, extracted properties of the image as feature maps (using a pyramid of difference-of-Gaussian filters at distinct orientations, colors and intensities), obtained feature-wise conspicuity by computing center-surround differences as receptive field responses, and integrated them into a unique map using winner-take-all mechanisms. This framework served as a starting point for saliency modeling [8, 9] and gave rise to a myriad of computational models that differ in their computations but conserve a similar pipeline. These representations were found to be remarkably similar to responses of cells in V1, which follow spatial properties similar to those of Gabor filters [11].
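The feature-map pipeline described above can be sketched in a few lines of Python. This is a minimal illustrative simplification, not the authors' implementation: it uses only an intensity channel (no color or orientation maps), a difference-of-Gaussians filter for the center-surround response, and a simple per-scale max-normalization in place of an iterative winner-take-all network.

```python
# Hypothetical sketch of an IKN-style center-surround saliency pipeline
# (intensity channel only; all function names here are our own).
import numpy as np
from scipy.ndimage import gaussian_filter

def center_surround(img, sigma_c, sigma_s):
    """Difference-of-Gaussians response: fine 'center' minus coarse 'surround'."""
    return gaussian_filter(img, sigma_c) - gaussian_filter(img, sigma_s)

def saliency(img, scales=(1.0, 2.0, 4.0)):
    """Sum rectified, normalized center-surround maps across scales."""
    smap = np.zeros_like(img, dtype=float)
    for s in scales:
        cs = center_surround(img, sigma_c=s, sigma_s=4 * s)
        cs = np.maximum(cs, 0.0)      # half-wave rectification
        if cs.max() > 0:
            cs /= cs.max()            # crude per-scale normalization
        smap += cs
    return smap / len(scales)

# A bright blob on a dark background should be the most salient location.
img = np.zeros((64, 64))
img[30:34, 30:34] = 1.0
smap = saliency(img)
peak = np.unravel_index(np.argmax(smap), smap.shape)
```

In the full IKN pipeline this per-channel conspicuity computation is repeated for color and orientation features before the maps are combined.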

