FastMDE: A Fast CNN Architecture for Monocular Depth Estimation at High Resolution

Thien-Thanh Dao, Won-Joo Hwang, Quoc-Viet Pham

doi:10.1109/access.2022.3145969

Abstract

A depth map helps robots and autonomous vehicles (AVs) visualize the three-dimensional world to navigate and localize neighboring obstacles. However, it is difficult to develop a deep learning model that can estimate the depth map from a single image in real time. This study proposes a fast monocular depth estimation model named FastMDE, obtained by optimizing a deep convolutional neural network built on the encoder-decoder architecture. The decoder needs to obtain partial and semantic feature maps from the encoding phase to improve the depth estimation accuracy. We therefore designed FastMDE with two strategies. The first redesigns the skip connection with the features of the squeeze-excitation module to obtain partial and semantic feature maps from the encoding phase. The second redesigns the decoder with the fusion dense block, which lets the decoder use high-resolution features learned earlier in the network before upsampling. The proposed FastMDE model uses only 4.1 M parameters, far fewer than state-of-the-art models, and achieves higher accuracy and lower latency than previous models. This study also demonstrates that MDE can run deep neural networks in real time (i.e., 30 fps) on the Linux embedded board Nvidia Jetson Xavier NX. The model can facilitate the development of applications that combine strong performance with easy deployment on an embedded platform.

Highlights

Depth map prediction from a single image is a fundamental aspect of several applications that involve three-dimensional (3D) visualizations of the real world
A lightweight convolutional network architecture named FastMDE was developed in this study by applying a novel skip connection with features of the eSE module, the dSE module, and the Fusion Dense (fDense) block
The network used self-supervised learning for fast monocular depth estimation (MDE) at high resolution
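The skip connection with squeeze-excitation features mentioned in the highlights can be illustrated with a generic squeeze-and-excitation gate. This is a minimal NumPy sketch of the general technique only: the function name `squeeze_excite`, the weight shapes, and the reduction ratio of 4 are assumptions for illustration and are not taken from the paper, whose eSE and dSE modules may differ.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Generic squeeze-and-excitation gate on a (C, H, W) feature map.

    w1: (C_mid, C) reduction weights, w2: (C, C_mid) expansion weights.
    Shapes are illustrative assumptions, not the paper's configuration.
    """
    s = x.mean(axis=(1, 2))                # squeeze: global average pool -> (C,)
    z = np.maximum(w1 @ s, 0.0)            # bottleneck layer with ReLU
    g = 1.0 / (1.0 + np.exp(-(w2 @ z)))    # sigmoid gate per channel, in (0, 1)
    return x * g[:, None, None]            # excite: rescale each channel

rng = np.random.default_rng(0)
C, H, W = 8, 4, 6
skip = rng.standard_normal((C, H, W))      # stand-in for an encoder feature map
w1 = rng.standard_normal((C // 4, C)) * 0.1
w2 = rng.standard_normal((C, C // 4)) * 0.1
gated = squeeze_excite(skip, w1, w2)       # same shape as the input
```

The gate leaves the spatial layout of the skip features untouched and only reweights channels, which is why it adds few parameters.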

Summary

INTRODUCTION

Depth map prediction from a single image is a fundamental aspect of several applications that involve three-dimensional (3D) visualizations of the real world. Since the depth estimation problem requires pixel-based information, the model needs the semantic features and spatial information of the object to predict its boundaries at a high resolution. The dense connection in the decoder is redesigned as fDense, which learns the high-resolution features obtained from the skip connection and the encoder features to produce highly detailed edge information before the upsampling process. This allows the model to predict sharper edges with higher accuracy. A TensorRT engine model, deployed on an Nvidia Linux embedded board (tested with the Nvidia Xavier NX), is also proposed.
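The idea of fusing skip and encoder features densely before upsampling can be sketched as follows. This is a rough NumPy illustration under stated assumptions, not the authors' fDense block: the helper names, the use of 1x1 convolutions, the growth rate of 4, the two dense layers, and nearest-neighbor upsampling are all hypothetical choices made to keep the example short.

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W), w: (C_out, C_in); 1x1 convolution followed by ReLU
    return np.maximum(np.einsum("oc,chw->ohw", w, x), 0.0)

def dense_fuse_then_upsample(dec, skip, weights):
    """Fuse decoder and skip features densely, then upsample by 2.

    Each layer receives the concatenation of all earlier feature maps
    (the dense-connection pattern). Layer count and sizes are assumptions.
    """
    feats = [np.concatenate([dec, skip], axis=0)]     # fuse along channels
    for w in weights:
        inp = np.concatenate(feats, axis=0)           # all earlier features
        feats.append(conv1x1(inp, w))
    fused = np.concatenate(feats, axis=0)
    # Nearest-neighbor upsampling happens only after the fusion step
    return fused.repeat(2, axis=1).repeat(2, axis=2)

rng = np.random.default_rng(0)
dec = rng.standard_normal((16, 6, 8))     # stand-in decoder features
skip = rng.standard_normal((8, 6, 8))     # stand-in skip features, same H x W
growth = 4
w_a = rng.standard_normal((growth, 24)) * 0.1            # sees 16 + 8 channels
w_b = rng.standard_normal((growth, 24 + growth)) * 0.1   # sees 24 + 4 channels
out = dense_fuse_then_upsample(dec, skip, [w_a, w_b])    # shape (32, 12, 16)
```

Because the dense layers run at the lower resolution, their cost is paid before the feature map grows fourfold in area, which is one plausible reason to fuse before upsampling.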

RELATED STUDIES

[Table fragment: layer entry "Conv1"; remaining content not recovered]

EVALUATION KITTI DATASET

[Table: columns Method, Dense, fDense; rows not recovered]

CONCLUSION


Journal: IEEE Access	Publication Date: Jan 1, 2022
Citations: 5	License type: CC BY 4.0


