Abstract
Vision-based unsupervised learning [1]–[3] has emerged as a promising approach to estimating monocular depth and ego-motion, avoiding the intensive effort of collecting and labeling ground truth. However, such methods are still constrained by the brightness-constancy assumption across video sequences, and they are especially susceptible to frequent illumination variations and nearby textureless surfaces in indoor environments. In this article, we selectively combine the complementary strengths of visual and inertial measurements, i.e., videos capture static and distinctive features while inertial readings depict scale-consistent and environment-agnostic motion, and propose a novel unsupervised learning framework that predicts monocular depth and the ego-motion trajectory simultaneously. This challenging task is addressed by learning from both forward and backward inertial sequences to suppress inevitable sensor noise, and by reweighting visual and inertial features via gated neural networks to adapt to varying environments and user-specific motion dynamics. In addition, we employ structural cues to produce scene depths from a single image and explore structure-consistency constraints to calibrate the depth estimates in indoor buildings. Experiments on the outdoor KITTI data set and our dedicated indoor prototype show that our approach consistently outperforms the state of the art on both depth and ego-motion estimation. To the best of our knowledge, this is the first work to fuse visual and inertial data without any supervision signals for monocular depth and ego-motion estimation, and our solution remains effective and robust even in textureless indoor scenarios.
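For illustration, the sketch below shows one plausible way to reweight visual and inertial features with a gated neural network, as described above: a sigmoid gate conditioned on both modalities produces per-channel weights that blend the two feature streams. The module structure, layer sizes, and names (e.g., GatedVisualInertialFusion) are assumptions for exposition, not the authors' actual architecture.

```python
# Minimal sketch of gated visual-inertial feature reweighting (assumed design,
# not the paper's exact network).
import torch
import torch.nn as nn

class GatedVisualInertialFusion(nn.Module):
    def __init__(self, visual_dim=512, inertial_dim=128, fused_dim=256):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, fused_dim)
        self.inertial_proj = nn.Linear(inertial_dim, fused_dim)
        # The gate observes both modalities and outputs per-channel weights
        # in [0, 1] that decide how much each modality contributes.
        self.gate = nn.Sequential(
            nn.Linear(2 * fused_dim, fused_dim),
            nn.Sigmoid(),
        )

    def forward(self, visual_feat, inertial_feat):
        v = self.visual_proj(visual_feat)      # (B, fused_dim)
        i = self.inertial_proj(inertial_feat)  # (B, fused_dim)
        g = self.gate(torch.cat([v, i], dim=-1))
        # Convex combination: g near 1 trusts vision, g near 0 trusts the IMU.
        return g * v + (1.0 - g) * i

if __name__ == "__main__":
    fusion = GatedVisualInertialFusion()
    visual = torch.randn(4, 512)    # e.g., pooled CNN features from a frame pair
    inertial = torch.randn(4, 128)  # e.g., encoded IMU window between frames
    fused = fusion(visual, inertial)
    print(fused.shape)              # torch.Size([4, 256])
```

In such a design, the learned gate can downweight visual features in textureless or poorly lit scenes and lean on the inertial stream instead, which matches the motivation stated in the abstract.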