Abstract

Existing self-supervised monocular depth estimation methods usually rely on increasingly large networks to achieve accurate results. However, larger networks are more difficult to train and require more storage space. To balance network size against estimation accuracy, we propose in this article a compact lightweight network for monocular depth estimation, named LW-Net. First, we construct a compact network by designing an iterative decoder with shared weights and a lightweight pyramid encoder. The proposed network contains significantly fewer parameters than most existing monocular depth estimation networks. Second, we exploit a self-supervised training strategy by combining the proposed LW-Net model with a pose network, and we then use a hybrid loss function to train the decoder and encoder separately. The proposed training strategy enables the LW-Net model to achieve better estimation accuracy than other methods. Finally, we evaluate the proposed LW-Net model on the KITTI and Make3D datasets in a comprehensive comparison with several state-of-the-art methods. The experimental results demonstrate that our method achieves the best accuracy while using the fewest parameters. Specifically, compared with the existing state-of-the-art method, our model's parameter count is reduced by 46.6%, the time cost is decreased by 7.69%, and the frame rate is increased by 5.19%.


Introduction

Depth estimation is an indispensable part of many tasks. It is widely used in robots [1]-[2], UAVs (unmanned aerial vehicles) [3]-[4], autonomous cars [5], and many other areas [6]-[7]. With the development of machine learning [48]-[50] and deep learning [51]-[53], many computer vision tasks have been significantly improved. Compared with sensor-based methods, image-based depth estimation requires only one or more cameras, so it has a wider range of applications and makes it possible for small robots to acquire depth estimation capability. Multiview-based depth estimation usually uses a camera array for image acquisition and exploits the redundant information between the multiview images to estimate depth.
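Self-supervised approaches of the kind discussed here typically supervise depth by reconstructing the target view from a neighboring view and penalizing the photometric difference, often with a hybrid of an SSIM term and an L1 term. The sketch below illustrates that generic hybrid loss only; it is not the paper's exact formulation. The weighting `alpha = 0.85` and the simplified global (rather than windowed) SSIM are assumptions made for brevity.

```python
import numpy as np

def photometric_loss(target, reconstructed, alpha=0.85, eps=1e-6):
    """Toy hybrid photometric loss: alpha-weighted mix of a simplified
    SSIM dissimilarity term and an L1 term, as commonly used in
    self-supervised depth estimation. `target` and `reconstructed` are
    grayscale images in [0, 1] as NumPy arrays of the same shape."""
    # L1 term: mean absolute photometric error.
    l1 = np.abs(target - reconstructed).mean()

    # Simplified global SSIM (real implementations use local windows).
    mu_t, mu_r = target.mean(), reconstructed.mean()
    var_t, var_r = target.var(), reconstructed.var()
    cov = ((target - mu_t) * (reconstructed - mu_r)).mean()
    c1, c2 = 0.01 ** 2, 0.03 ** 2  # standard SSIM stabilizing constants
    ssim = ((2 * mu_t * mu_r + c1) * (2 * cov + c2)) / (
        (mu_t ** 2 + mu_r ** 2 + c1) * (var_t + var_r + c2) + eps)
    dssim = (1.0 - ssim) / 2.0  # dissimilarity in [0, 1]

    return alpha * dssim + (1.0 - alpha) * l1
```

In a full pipeline the `reconstructed` image would come from warping a source view using the predicted depth and the pose network's relative camera motion; a perfect reconstruction drives the loss toward zero.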
