Abstract

A biologically inspired model architecture for inferring 3D shape from texture is proposed. The model is hierarchically organized into modules that roughly correspond to visual cortical areas in the ventral stream. Initial orientation-selective filtering decomposes the input into low-level orientation and spatial-frequency representations. Grouping of spatially anisotropic orientation responses builds sketch-like representations of surface shape. Gradients in the orientation fields, followed by integration, yield local surface geometry and globally consistent 3D depth. From the distributions of orientation responses summed over frequency, an estimate of the tilt and slant of the local surface can be obtained. The model suggests how 3D shape can be inferred from texture patterns and their image appearance in a hierarchically organized processing cascade along the cortical ventral stream. It integrates oriented texture gradient information that is encoded in distributed maps of orientation-frequency representations. The texture energy gradient is defined by changes in the grouped, summed, and normalized orientation-frequency response activity extracted from the textured object image. This activity is integrated by directed fields to generate a 3D shape representation of a complex object, with depth ordering proportional to the fields' output: higher activity denotes larger relative distance from the viewer.
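As a rough illustration of the first stage described above (orientation-selective filtering, then summing responses over frequency and normalizing), the following Python sketch uses a small bank of complex Gabor filters. Gabor filters are a common stand-in for orientation-selective cells, but the abstract does not say which filters the model uses. The kernel shape, the bandwidth rule `sigma = 0.5 / freq`, the choice of four orientations and two frequencies, and all function names are assumptions made for this example only.

```python
import numpy as np

def gabor_kernel(freq, theta, sigma):
    """Complex Gabor kernel whose carrier varies along direction theta (radians).
    Hypothetical helper: shape and bandwidth are assumptions, not the paper's."""
    half = int(np.ceil(3 * sigma))
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    u = x * np.cos(theta) + y * np.sin(theta)       # coordinate along the carrier
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    kernel = envelope * np.exp(2j * np.pi * freq * u)
    return kernel - kernel.mean()                   # remove the residual DC response

def circular_filter(image, kernel):
    """Circular convolution via the FFT, with the kernel centred on the origin."""
    padded = np.zeros(image.shape, dtype=complex)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded))

def orientation_energy(image, thetas, freqs, eps=1e-12):
    """Per-orientation energy, summed over frequency and normalized so that
    the orientation channels sum to (almost exactly) 1 at every pixel."""
    energy = np.zeros((len(thetas),) + image.shape)
    for i, theta in enumerate(thetas):
        for freq in freqs:
            response = circular_filter(image, gabor_kernel(freq, theta, 0.5 / freq))
            energy[i] += np.abs(response) ** 2
    return energy / (energy.sum(axis=0, keepdims=True) + eps)

# Toy input: a grating whose intensity varies along x (vertical stripes).
yy, xx = np.mgrid[0:64, 0:64]
image = np.cos(2 * np.pi * 0.125 * xx)
thetas = np.linspace(0.0, np.pi, 4, endpoint=False)   # 0, 45, 90, 135 degrees
energy = orientation_energy(image, thetas, freqs=[0.125, 0.25])
dominant = int(np.argmax(energy.mean(axis=(1, 2))))   # index of the strongest channel
```

For this grating the strongest channel is the one whose carrier runs along x (`thetas[0]`). On a textured surface, the spatial change of such normalized channel activity is the kind of signal the abstract refers to as a texture energy gradient. The grouping and integration stages are not sketched here.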

Highlights

  • Constructing a neural representation of the 3D shape of an object from the monocular 2D information available in the retinal image is one of the challenging tasks of biological visual systems

  • Several results are shown to demonstrate the functionality of the proposed model architecture

  • The model produces two types of final output: a 2D sketch showing object boundaries and occlusion borders, and a 3D mesh whose values indicate the relative depth of each surface location, with low activity corresponding to surface regions closer to the viewer and higher values to regions farther away

Summary

Introduction

Constructing a neural representation of the 3D shape of an object from the monocular 2D information available in the retinal image is one of the challenging tasks of biological visual systems. The representation of depth structure can be computed from various visual cues, such as binocular disparity, kinetic motion, and texture gradients. Depth-related information can be extracted from a single monocular image by analyzing the distortions of the texture that are caused by the object structure and the distance of the surface from the camera. The causes of these distortions range from the material of the object and the type of projection to depth differences and the slant and tilt of the surface region. The visual system uses neural sensitivity to gradients of such distortions present in the distribution of neural responses
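To make the role of slant and tilt concrete, the sketch below works through the standard foreshortening relation for a single texture element. It is a textbook geometric illustration, not the neural mechanism proposed in the paper. It assumes orthographic projection and a circular texture element. Under those assumptions the element projects to an ellipse whose minor axis lies along the tilt direction and whose axis ratio equals the cosine of the slant. The function names are hypothetical.

```python
import numpy as np

def project_circle(slant, tilt, n=360):
    """Image-plane outline of a unit circle lying on a plane with the given
    slant and tilt (radians), under orthographic projection (an assumption)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    a = np.cos(t) * np.cos(slant)      # foreshortened along the tilt direction
    b = np.sin(t)                      # unchanged perpendicular to it
    x = a * np.cos(tilt) - b * np.sin(tilt)
    y = a * np.sin(tilt) + b * np.cos(tilt)
    return np.stack([x, y], axis=1)

def slant_tilt_from_outline(points):
    """Recover slant and tilt from the second moments of the projected outline.
    Tilt is only defined modulo pi, because the sign of the depth gradient is
    ambiguous for a single element under orthographic projection."""
    cov = np.cov(points, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    slant = np.arccos(np.sqrt(evals[0] / evals[1]))
    minor_axis = evecs[:, 0]                       # direction of strongest compression
    tilt = np.arctan2(minor_axis[1], minor_axis[0]) % np.pi
    return slant, tilt

outline = project_circle(np.deg2rad(60.0), np.deg2rad(30.0))
est_slant, est_tilt = slant_tilt_from_outline(outline)
```

With a slant of 60 degrees and a tilt of 30 degrees, the estimates match the inputs up to floating-point error. Real textures are noisier and perspective projection adds size and density gradients, which is why gradient-sensitive mechanisms of the kind discussed above are needed in practice.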

