Abstract
A computational explanation of how visual attention, interpretation of visual stimuli, and eye movements combine to produce visual behavior seems elusive. Here, we focus on one component: how selection is accomplished for the next fixation. The popularity of saliency map models drives the inference that this problem is solved, but we argue otherwise. We provide arguments that a cluster of complementary conspicuity representations drives selection, modulated by task goals and history, leading to a hybrid process that encompasses early and late attentional selection. This design is also constrained by the architectural characteristics of the visual processing pathways. These elements combine into a new strategy for computing fixation targets, and a first simulation of its performance is presented. A sample video of this performance can be found via the "Supplementary Files" link under the "Article Tools" heading.
Highlights
Given our goal of a computational explanation of the relationship among visual attention, interpretation of visual stimuli, and eye movements, it is natural to begin with a look at past efforts that may serve as foundations.
It is important to note at the outset that not all of the structure in Figure 10 has been implemented at this time. There is no implementation of task influences, and we demonstrate free-viewing performance only. The attentional sample is not included, nor is it needed for the example, since we do not have a high-resolution task. The pursuit cues are likewise not included and, again, not needed, since the image is static.
We have presented a novel view of the functional relationship among visual attention, interpretation of visual stimuli, and eye movements.