Abstract

Rotations in depth are challenging for object vision because features can appear, disappear, be stretched or compressed. Yet we easily recognize objects across views. Are the underlying representations view invariant or dependent? This question has been intensely debated in human vision, but the neuronal representations remain poorly understood. Here, we show that for naturalistic objects, neurons in the monkey inferotemporal (IT) cortex undergo a dynamic transition in time, whereby they are initially sensitive to viewpoint and later encode view-invariant object identity. This transition depended on two aspects of object structure: it was strongest when objects foreshortened strongly across views and were similar to each other. View invariance in IT neurons was present even when objects were reduced to silhouettes, suggesting that it can arise through similarity between external contours of objects across views. Our results elucidate the viewpoint debate by showing that view invariance arises dynamically in IT neurons out of a representation that is initially view dependent.

Highlights

  • Object vision is challenging because images of an object vary widely with the viewing conditions (DiCarlo et al 2012; Logothetis and Sheinberg 1996; Pinto et al 2008; Poggio and Ullman 2013; Tanaka 1996)

  • It is important to compare the view invariance measured in IT neurons with that expected from the image itself

  • We characterized the dynamics of 3D view invariance in monkey IT neurons for naturalistic objects

Summary

Introduction

Object vision is challenging because images of an object vary widely with the viewing conditions (DiCarlo et al 2012; Logothetis and Sheinberg 1996; Pinto et al 2008; Poggio and Ullman 2013; Tanaka 1996). IT neurons respond in a view-invariant manner to familiarized natural objects (Booth and Rolls 1998) and faces (Eifuku et al 2004, 2011; Freiwald and Tsao 2010; Perrett et al 1991). These findings can be reconciled with reports of view-dependent responses by the fact that those studies used artificial objects, such as wireframes, whose images vary drastically with viewpoint, whereas the studies cited above used natural objects whose images do not vary greatly with viewpoint. Our results suggest a coarse-to-fine processing scheme in IT, wherein neurons are initially sensitive to coarse, low-level differences between views and only later encode object identity independent of view.
