Abstract

The brain is able to maintain a stable perception although the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. This paper presents a theory for achieving basic invariance properties already at the level of receptive fields. Specifically, the presented framework comprises (i) local scaling transformations caused by objects of different size and at different distances to the observer, (ii) locally linearized image deformations caused by variations in the viewing direction in relation to the object, (iii) locally linearized relative motions between the object and the observer, and (iv) local multiplicative intensity transformations caused by illumination variations. The receptive field model can be derived by necessity from symmetry properties of the environment and leads to predictions about receptive field profiles in good agreement with receptive field profiles measured by cell recordings in mammalian vision. Indeed, the receptive field profiles in the retina, LGN and V1 are close to the ideal forms motivated by these idealized requirements. By complementing receptive field measurements with selection mechanisms over the parameters in the receptive field families, it is shown how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields and support invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination.
The theory can explain the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time, from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.

Highlights

  • We maintain a stable perception of our environment although the brightness patterns reaching our eyes undergo substantial changes

  • We have described how the shapes of receptive field profiles in the early visual pathway can be constrained from structural symmetry properties of the environment, which include the requirement that the receptive field responses should be sufficiently well-behaved under basic image transformations

  • We have shown how these covariance properties of receptive fields enable true invariance properties of visual processes at the systems level, if combined with max-like operations over the output of receptive field families tuned to different filter parameters
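The last highlight can be illustrated with a small numerical sketch. It is a one-dimensional toy model under stated assumptions, not the paper's implementation: the input is a Gaussian blob, the receptive field family is approximated by the scale-normalized second derivative of a Gaussian-smoothed signal, and all function names (`gaussian_blob`, `max_over_scales`, and so on) are hypothetical. Taking the maximum over the family of scales should give roughly the same peak response when the blob is doubled in size, while the selected scale doubles with it.

```python
import numpy as np

# Hypothetical helper names; a 1-D toy model, not the paper's implementation.

def gaussian_blob(sigma0, length=4001):
    """1-D Gaussian blob of unit height and spatial extent sigma0, centred in the signal."""
    x = np.arange(length) - length // 2
    return np.exp(-x**2 / (2.0 * sigma0**2))

def gaussian_kernel(sigma):
    """Sampled, normalised Gaussian kernel truncated at 5 standard deviations."""
    radius = int(np.ceil(5.0 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

def normalized_response_at_centre(signal, sigma):
    """Magnitude of sigma^2 * L_xx at the signal centre (scale-normalised second derivative)."""
    smoothed = np.convolve(signal, gaussian_kernel(sigma), mode="same")
    c = signal.size // 2
    second_diff = smoothed[c - 1] - 2.0 * smoothed[c] + smoothed[c + 1]
    return abs(sigma**2 * second_diff)

def max_over_scales(signal, sigmas):
    """Max-like selection: return (peak response, scale at which it is attained)."""
    responses = np.array([normalized_response_at_centre(signal, s) for s in sigmas])
    best = int(np.argmax(responses))
    return responses[best], sigmas[best]

# Assumed family of filter scales: geometric grid from sigma = 2 to sigma = 64.
SIGMAS = 2.0 * 2.0 ** (np.arange(41) / 8.0)

if __name__ == "__main__":
    for sigma0 in (8.0, 16.0):
        peak, sigma_hat = max_over_scales(gaussian_blob(sigma0), SIGMAS)
        print(f"blob size {sigma0:5.1f}: peak response {peak:.3f} at sigma = {sigma_hat:.2f}")
```

For an ideal continuous Gaussian blob of extent σ₀, the scale-normalized response peaks at σ = √2·σ₀ with magnitude 2/(3√3) ≈ 0.385, independent of σ₀. The discrete sketch should land close to these values: the individual filter responses are covariant with the blob size, and the max over scales turns that covariance into an invariant response.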


Summary

Introduction

We maintain a stable perception of our environment although the brightness patterns reaching our eyes undergo substantial changes.

The following structural requirements can be imposed, motivated by the special nature of time and space-time. Galilean covariance: for time-dependent spatio-temporal image data, we may have relative motions between objects in the world and the observer, where a constant-velocity translational motion can be modelled by a Galilean transformation $f' = \mathcal{G}_v f$ (39), corresponding to $f'(x', t') = f(x, t)$ with $x' = x + v t$ (40).

Other time-causal temporal scale-space models have been proposed by Koenderink [83], based on a logarithmic transformation of time in relation to a time delay relative to the present moment, and by Lindeberg and Fagerström [84], based on a set of first-order integrators corresponding to truncated exponential filters with time constants $\mu_i$ coupled in cascade: $h_{\mathrm{composed}}(t; \mu) = \ldots$ Although such first-order temporal integrators satisfy weaker scale-space properties, in the sense of guaranteeing non-creation of local extrema or zero-crossings for a one-dimensional temporal signal, they do not permit true covariance under rescalings of the temporal axis.

For spatio-temporal receptive fields $\partial_{x^\alpha t^\beta} L$ that involve explicit temporal derivatives with $\beta > 0$, it will disappear altogether, since the vignetting only depends upon the spatial coordinates.
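The cascade of first-order integrators mentioned above can be sketched numerically. The snippet below is a minimal illustration under stated assumptions, not the formulation of [84]: each integrator is assumed to have the normalized truncated exponential kernel exp(-t/μ)/μ for t ≥ 0, the cascade is approximated by discrete convolution on a fine time grid, and the function names and time constants are hypothetical. It checks a standard property of such a cascade: the temporal mean (delay) of the composed kernel is the sum of the time constants, and its variance is the sum of their squares.

```python
import numpy as np

def truncated_exponential(t, mu):
    """Assumed kernel of one first-order integrator: exp(-t/mu)/mu for t >= 0, else 0."""
    return np.where(t >= 0, np.exp(-t / mu) / mu, 0.0)

def cascade_kernel(time_constants, dt=0.01, t_max=60.0):
    """Approximate the composed kernel of integrators coupled in cascade.

    Starts from a discrete unit impulse and convolves with one truncated
    exponential per time constant; the result stays causal (zero for t < 0).
    """
    t = np.arange(0.0, t_max, dt)
    kernel = np.zeros_like(t)
    kernel[0] = 1.0 / dt  # discrete approximation of a unit impulse
    for mu in time_constants:
        stage = truncated_exponential(t, mu)
        kernel = np.convolve(kernel, stage)[: t.size] * dt
    return t, kernel

def kernel_mean_variance(t, kernel):
    """Temporal mean and variance of a non-negative kernel (normalised by its mass)."""
    mass = kernel.sum()
    mean = (t * kernel).sum() / mass
    variance = ((t - mean) ** 2 * kernel).sum() / mass
    return mean, variance

if __name__ == "__main__":
    mus = [0.5, 1.0, 2.0]  # hypothetical time constants
    t, h = cascade_kernel(mus)
    mean, variance = kernel_mean_variance(t, h)
    print(f"mean {mean:.3f} (sum of mu_i = {sum(mus):.3f})")
    print(f"variance {variance:.3f} (sum of mu_i^2 = {sum(m * m for m in mus):.3f})")
```

Because the time constants enter through sums of discrete contributions, the temporal scale of such a cascade cannot be rescaled continuously in the way a Gaussian kernel can, which is consistent with the remark that these integrators do not permit true covariance under rescalings of the temporal axis.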

Findings
Summary and conclusions
Discussion
