Abstract

How do we decide which objects in a visual scene are more interesting? While intuition may point toward high-level object recognition and cognitive processes, here we investigate the contribution of a much simpler process: low-level visual saliency. We used the LabelMe database (24,863 photographs with 74,454 manually outlined objects) to evaluate how often interesting objects were among the few most salient locations predicted by a computational model of bottom-up attention. In 43% of all images (chance 21%), the model's predicted most salient location fell within a labeled region. Furthermore, in 76% of the images (chance 43%), at least one of the top three salient locations fell on an outlined object, with performance leveling off after six predicted locations. The bottom-up attention model has no notion of objects or of semantic relevance. Hence, our results indicate that the selection of interesting objects in a scene is largely constrained by low-level visual properties rather than solely determined by higher cognitive processes.
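The evaluation described above can be made concrete with a short sketch. The following Python code is a minimal illustration, not the authors' implementation: it assumes each image comes with a precomputed bottom-up saliency map (e.g., from an Itti-Koch-style model) and a list of binary object masks derived from the LabelMe outlines; the function and parameter names (`top_salient_locations`, `inhibition_radius`, etc.) are hypothetical. It selects the top-k salient peaks by repeated argmax with inhibition of return, scores a hit when a peak lands inside any outlined object, and estimates the chance baseline by sampling uniformly random locations.

```python
# Minimal sketch of the saliency-vs-labeled-objects evaluation.
# Assumptions: `saliency_map` is a 2D float array, `object_masks` is a
# list of 2D boolean arrays (one per outlined object), all same shape.

import numpy as np

def top_salient_locations(saliency_map, k=3, inhibition_radius=32):
    """Return the k most salient (row, col) peaks, suppressing a disk
    around each chosen peak (inhibition of return) before the next pick."""
    sal = saliency_map.astype(float).copy()
    h, w = sal.shape
    rows, cols = np.ogrid[:h, :w]
    peaks = []
    for _ in range(k):
        r, c = np.unravel_index(np.argmax(sal), sal.shape)
        peaks.append((r, c))
        # Suppress a neighborhood so the next argmax finds a new region.
        sal[(rows - r) ** 2 + (cols - c) ** 2 <= inhibition_radius ** 2] = -np.inf
    return peaks

def hit_within_top_k(saliency_map, object_masks, k=3):
    """True if any of the top-k salient locations falls inside any
    manually outlined object region."""
    return any(
        mask[r, c]
        for (r, c) in top_salient_locations(saliency_map, k)
        for mask in object_masks
    )

def chance_hit_rate(object_masks, k=3, n_samples=1000, rng=None):
    """Chance baseline: probability that at least one of k uniformly
    random locations lands on a labeled object."""
    rng = np.random.default_rng(rng)
    union = np.logical_or.reduce(object_masks)  # pixels covered by any object
    h, w = union.shape
    hits = 0
    for _ in range(n_samples):
        rs = rng.integers(0, h, size=k)
        cs = rng.integers(0, w, size=k)
        if union[rs, cs].any():
            hits += 1
    return hits / n_samples

# Usage over a hypothetical dataset of (saliency_map, object_masks) pairs:
# hits = [hit_within_top_k(sal, masks, k=3) for sal, masks in dataset]
# print(np.mean(hits))  # compare against the per-image chance_hit_rate
```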
