Abstract

Any visual sensor, whether artificial or biological, maps the 3D world onto a 2D representation. The missing dimension is depth, and most species use stereo vision to recover it. Stereo vision implies multiple perspectives and matching; hence it obtains depth from a pair of images. Algorithms for stereo vision are also used successfully in robotics. While biological systems seem to compute disparities effortlessly, artificial methods suffer from high energy demands and latency. The crucial part is the correspondence problem: finding the matching points between two images. The development of event-based cameras, inspired by the retina, enables the exploitation of an additional physical constraint: time. Owing to their asynchronous mode of operation, which takes the precise timing of spikes into account, Spiking Neural Networks can exploit this constraint. In this work, we investigate sensors and algorithms for event-based stereo vision, leading to more biologically plausible robots. We focus mainly on binocular stereo vision.

Highlights

  • As the visual sense and any visual sensor lose one dimension when mapping the 3D world onto a 2D representation, the ability to recover depth is crucial for biological and artificial vision systems

  • Compared to conventional synchronous methods, these models use a further constraint, time, to suppress false correspondences (Dikov et al., 2017). This brings a great novelty to the old approach: the network input is not composed of static images but of spike trains containing spatio-temporal information (a first sketch after this list illustrates such temporal matching)

  • This is a simple yet elegant solution, whereby the camera alters its focal distance in a steady manner and a Spiking Neural Network (SNN) computes the time of best focus for each pixel, creating a depth map (a second sketch after this list illustrates the idea)
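
The following is a minimal sketch of the temporal-matching idea, not the formulation of Dikov et al. (2017): events from a rectified stereo pair are accepted as correspondence candidates only if they share row and polarity and occur within a short coincidence window. The event format (x, y, t, polarity), the window length, and the name match_events are illustrative assumptions, and the sketch is plain Python rather than a spiking implementation.

    def match_events(left_events, right_events, window=1e-3, max_disparity=40):
        """Naive temporal-coincidence matcher for rectified event streams.

        left_events, right_events: lists of (x, y, t, polarity) tuples,
        each sorted by timestamp. Returns (x_left, y, disparity, t) tuples.
        """
        matches = []
        j0 = 0
        for xl, yl, tl, pl in left_events:
            # Right events older than tl - window can never match this
            # or any later left event, so advance the lower bound.
            while j0 < len(right_events) and right_events[j0][2] < tl - window:
                j0 += 1
            j = j0
            # Scan right events inside the coincidence window around tl.
            while j < len(right_events) and right_events[j][2] <= tl + window:
                xr, yr, tr, pr = right_events[j]
                disparity = xl - xr
                # Same row, same polarity, plausible disparity: candidate match.
                if yr == yl and pr == pl and 0 <= disparity <= max_disparity:
                    matches.append((xl, yl, disparity, tl))
                j += 1
        return matches

For example, a left event (10, 5, 0.0010, 1) and a right event (7, 5, 0.0012, 1) yield the candidate (10, 5, 3, 0.0010), i.e. a disparity of 3 pixels.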
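
A second sketch, under similarly loose assumptions, illustrates depth from focus during a steady focal sweep: each pixel emits a burst of events when the scene crosses its plane of focus, and the time of peak activity maps to depth via a calibration of the sweep. For simplicity, the SNN described above is replaced here by a per-pixel event histogram whose peak bin plays the same role of locating the time of best focus; the helper depth_from_focal_sweep and the calibration array sweep_depths are hypothetical names.

    import numpy as np

    def depth_from_focal_sweep(events, shape, t0, t1, sweep_depths):
        """Depth map from one focal sweep of the lens between times t0 and t1.

        events: iterable of (x, y, t) tuples recorded during the sweep.
        shape: (H, W) sensor resolution.
        sweep_depths: float array of length n_bins; sweep_depths[b] is the
            focal distance of the lens during time bin b (assumed calibrated).
        Pixels that emitted no events get NaN (depth unknown).
        """
        H, W = shape
        n_bins = len(sweep_depths)
        hist = np.zeros((H, W, n_bins), dtype=np.int32)
        for x, y, t in events:
            # Bin each event by its time within the sweep.
            b = min(int((t - t0) / (t1 - t0) * n_bins), n_bins - 1)
            hist[y, x, b] += 1
        peak_bin = hist.argmax(axis=2)  # time bin of maximal activity per pixel
        depth = np.asarray(sweep_depths, float)[peak_bin]  # best-focus time -> depth
        depth[hist.sum(axis=2) == 0] = np.nan  # silent pixels: no estimate
        return depth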

Summary

INTRODUCTION

As the visual sense and any visual sensor lose one dimension when mapping the 3D world onto a 2D representation, the ability to recover depth is crucial for biological and artificial vision systems. Stereo vision refers to the method of recovering depth information from both eyes or, in the artificial context, from two sensors. In biology this is possible because the laterally shifted eyes capture slightly different views of a scene. While biology computes disparities seemingly effortlessly, current approaches that compute stereo in real time are too computationally expensive. This is mainly caused by acquiring and processing huge amounts of redundant data. Examples of event-based stereo vision applications applying networks with spiking neurons are Dikov et al. (2017), Osswald et al. (2017), Rebecq et al. (2017), and Haessig et al. (2019). As depth sensing is a large topic, many different techniques exist, such as radar, ultrasonic sensors, light section, structured light, and depth from defocus/focus. This manuscript focuses mainly on binocular stereo vision. The evolution of event-based sensors and a comparison among them are presented, followed by an investigation of cooperative algorithms and their alternatives for event-driven stereo vision.
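
For a rectified setup, once corresponding points are found, depth follows from the standard triangulation relation; the snippet below states it with textbook symbols (focal length f, baseline B, disparity d), which are conventions rather than notation taken from the reviewed papers.

    % Depth from disparity for a rectified stereo pair:
    %   f -- focal length (in pixels), B -- baseline between the two sensors,
    %   d -- disparity, d = x_left - x_right (in pixels)
    Z = \frac{f \cdot B}{d}
    % A larger disparity means a closer point; d -> 0 corresponds to infinity.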

TECHNICAL AND BIOLOGICAL BACKGROUND
Conventional Cameras and Their Principle of Operation
Depth Perception in Machine Vision
The Retina
Biological Depth Perception
EVENT-BASED VISUAL SENSORS
The Silicon Retina—Emergence and Fundamentals
Address Event Representation
Comparison of the Best-Known Exponents
Additional Models of Event-Based Sensors
EVENT-DRIVEN STEREOSCOPY
Cooperative Algorithms
Extensions of Cooperative Algorithms
Alternatives to Cooperative Algorithms
CONCLUSION
Remaining Issues of Silicon Retinas
Artificial Stereoscopy—A Comparison
Findings
Outlook