Abstract
In daily visual experience, the human visual system extracts functionally meaningful features from the visual environment to perform necessary cognitive tasks. How does visual attention operate in such complex environments? Would conventional attention theories, such as feature integration theory (FIT) and guided search (GS), apply to such scene features? These theories provide a framework for how selective attention parses visual input into basic features and binds those features into integral percepts. This theoretical framework has so far been tested mainly with basic, localized features, such as colour and orientation. Here, we investigate to what extent the FIT and GS framework generalizes to ecologically valid scene features. We conducted a series of visual search experiments in which participants searched for a target scene among distractor scenes. These scenes were generated within a two-dimensional parametric space of high-level scene features, such as indoor lighting, scene layout, or surface texture. We sampled target and distractor scenes from this space in such a way that we could compare feature and conjunction search behaviours. Visual search performance across different set sizes showed that (1) search was never efficient: both feature and conjunction search conditions exhibited set size effects; but (2) feature search was significantly more efficient than conjunction search. Given these results, we propose that real-world scene features are not preattentive and require selective attention for successful visual search. However, these features still meaningfully guide attention in a manner consistent with GS.