Abstract
It is well known that simple visual tasks, such as object detection or categorization, can be performed within a short period of time, suggesting that feed-forward visual processing is sufficient for them. However, more complex visual tasks, such as fine-grained localization, may require high-resolution information available at the early processing levels of the visual hierarchy. To access this information using a top-down approach, feedback processing would need to traverse several stages of the visual hierarchy, and each step in this traversal takes processing time. In the present study, we compared the processing time required to complete object categorization and localization by varying the presentation duration and complexity of natural scene stimuli. We hypothesized that performance would reach asymptote at short presentation durations if feed-forward processing suffices for a visual task, whereas performance would improve gradually with longer presentation if the task relies on feedback processing. In Experiment 1, where simple images were presented, both object categorization and localization performance improved sharply up to 100 ms of presentation and then leveled off. These results replicate previously reported rapid categorization effects but do not support a role for feedback processing in localization tasks, indicating that feed-forward processing enables coarse localization in relatively simple visual scenes. In Experiment 2, the same tasks were performed, but more attention-demanding and ecologically valid images were used as stimuli. Unlike in Experiment 1, both object categorization performance and localization precision improved gradually as stimulus presentation duration increased. This finding suggests that complex visual tasks that require visual scrutiny call for top-down feedback processing.
Highlights
The human visual system is known to be very rapid and efficient at analyzing some types of visual information
Consistent with many previous studies, the present study suggests that human vision can very rapidly determine the category of an object embedded in a visual scene: animal detection accuracy improved dramatically as stimulus presentation duration increased up to ~100 ms
Animal detection accuracy at longer presentation durations was dependent on scene complexity
Summary
The human visual system is known to be very rapid and efficient at analyzing some types of visual information. People can determine whether a briefly flashed image contains an object of a given category, and categorization performance holds even when another visual pattern immediately follows the target image, as in backward masking or rapid serial visual presentation (RSVP) [1,2,3,4,5,6,7,8,9,10]. Studies using classifier-based readout techniques demonstrated that information about object category and identity can be decoded from human temporal cortex and macaque inferior temporal area (IT) as early as 100 ms after stimulus onset, suggesting that hierarchical feed-forward processing is sufficient for rapid object categorization [14,15,16].