Abstract
Current CNN-based stereo depth estimation models can barely run under real-time constraints on embedded graphics processing unit (GPU) devices. Moreover, state-of-the-art evaluations usually do not consider model optimization techniques, so the actual potential of embedded GPU devices remains unknown. In this work, we evaluate two state-of-the-art models on three different embedded GPU devices, with and without optimization methods, presenting performance results that illustrate the actual capabilities of embedded GPUs for stereo depth estimation. More importantly, based on our evaluation, we propose a U-Net-like architecture for post-processing the cost volume, instead of the typical sequence of 3D convolutions, drastically increasing the runtime speed of current models. In our experiments, we achieve real-time inference speeds, in the range of 5–32 ms, for 1216 × 368 input stereo images on the Jetson TX2, Jetson Xavier, and Jetson Nano embedded devices.
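A back-of-envelope sketch of why replacing 3D convolutions with a 2D U-Net-like network over the cost volume cuts computation. All sizes and channel widths below are illustrative assumptions, not the paper's exact configuration:

```python
# Rough FLOP comparison (assumed sizes): post-processing a cost volume of
# D disparity hypotheses at H x W spatial resolution.
D, H, W = 48, 368 // 4, 1216 // 4   # assumed 1/4-resolution cost volume
C = 32                              # assumed feature channel width

# One 3x3x3 3D convolution over a (C, D, H, W) volume:
flops_3d = 2 * (3 * 3 * 3) * C * C * D * H * W

# One 3x3 2D convolution after folding the D disparities into the channel
# dimension, mapping D input channels to C output channels:
flops_2d = 2 * (3 * 3) * D * C * H * W

speedup = flops_3d / flops_2d   # = 3 * C, independent of D, H, and W
```

Under these assumptions a single 2D layer is `3 * C` times cheaper than a single 3D layer, which is the kind of saving that makes real-time rates reachable on embedded GPUs.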
Highlights
Depth estimation from stereo cameras is an essential cue for many robotic applications
We evaluate the two fastest models reported in the literature to run in real time on one or more embedded graphics processing unit (GPU) devices
Our proposed model achieves a disparity pixel error of 11.2% and can run at real-time speed even on the Jetson Nano embedded GPU device
Summary
Depth estimation from stereo cameras is an essential cue for many robotic applications (e.g., object manipulation [1], obstacle avoidance [2], and 3D object detection [3]). It gives a robot, for example, the ability to perceive and interact with 3D objects and environments, which is critical for real-world applications. Depth estimation from a stereo vision system is a correspondence problem: for each pixel in a reference image, we need to find its corresponding pixel in a target image. The term real-time will refer to disparity maps obtained in less than 33.3 ms, i.e., at least 30 frames per second (FPS), as in [5].
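A minimal sketch of the correspondence problem using naive block matching in NumPy. The function name and parameters are illustrative, and real systems (including the CNN models evaluated in this work) use far more robust matching than raw sum-of-absolute-differences:

```python
import numpy as np

def block_match_disparity(left, right, max_disp=64, block=5):
    """Naive block matching: for each pixel in the left (reference) image,
    search along the same row of the right (target) image for the horizontal
    shift (disparity) minimizing the sum of absolute differences."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            ref = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = np.inf, 0
            # Only consider shifts that keep the target window in bounds.
            for d in range(min(max_disp, x - half) + 1):
                tgt = right[y - half:y + half + 1,
                            x - d - half:x - d + half + 1]
                cost = np.abs(ref - tgt).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

Even this toy version makes the cost structure visible: the search over disparities is what deep models amortize into a cost volume, and it is the post-processing of that volume that dominates runtime on embedded GPUs.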