Abstract

Pose estimation plays a critical role in self-supervised monocular depth estimation for indoor scenes, especially those involving complex ego-motion. In this letter, we incorporate two-view geometry constraints into pose estimation to boost its accuracy, which ultimately improves the performance of self-supervised depth estimation. Specifically, we decompose pose estimation into two steps: initial homography estimation and iterative residual refinement. We first introduce a Homography Estimation Module (HEM) to estimate large 3-DoF rotations. Then, we estimate the 6-DoF residual pose with an Iterative Residual Refinement Module (IRM). Finally, the supervision signal is generated with the refined pose and used to train the DepthNet. Experiments on the NYU Depth V2 dataset show that our pose estimation approach significantly improves the performance of the DepthNet, and the proposed method achieves state-of-the-art depth estimation results. Furthermore, experiments on the ScanNet dataset demonstrate the generalization ability of our method for both pose estimation and depth estimation.
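The following is a minimal sketch of the two-step pose pipeline outlined above: an initial 3-DoF rotation estimate followed by iterative 6-DoF residual refinement, whose output pose would be used to warp the source view for photometric supervision of the DepthNet. All module names, layer shapes, and scaling factors here are illustrative assumptions and do not reproduce the authors' exact architecture.

```python
# Illustrative sketch only: HEM/IRM internals below are assumptions,
# not the authors' architecture.
import torch
import torch.nn as nn


def euler_to_rotation(angles: torch.Tensor) -> torch.Tensor:
    """Convert (B, 3) Euler angles (rad) to (B, 3, 3) rotation matrices."""
    rx, ry, rz = angles[:, 0], angles[:, 1], angles[:, 2]
    zeros, ones = torch.zeros_like(rx), torch.ones_like(rx)
    Rx = torch.stack([ones, zeros, zeros,
                      zeros, rx.cos(), -rx.sin(),
                      zeros, rx.sin(), rx.cos()], dim=-1).view(-1, 3, 3)
    Ry = torch.stack([ry.cos(), zeros, ry.sin(),
                      zeros, ones, zeros,
                      -ry.sin(), zeros, ry.cos()], dim=-1).view(-1, 3, 3)
    Rz = torch.stack([rz.cos(), -rz.sin(), zeros,
                      rz.sin(), rz.cos(), zeros,
                      zeros, zeros, ones], dim=-1).view(-1, 3, 3)
    return Rz @ Ry @ Rx


class HEM(nn.Module):
    """Hypothetical homography-based module predicting a large 3-DoF rotation."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, 7, stride=2, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 3)  # 3-DoF rotation as Euler angles

    def forward(self, img_pair: torch.Tensor) -> torch.Tensor:
        feat = self.encoder(img_pair).flatten(1)
        return 0.1 * self.head(feat)  # small-angle scaling for stability


class IRM(nn.Module):
    """Hypothetical iterative refinement module predicting a 6-DoF residual pose."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, 7, stride=2, padding=3), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32 + 6, 6)  # conditioned on the current pose estimate

    def forward(self, img_pair: torch.Tensor, pose6: torch.Tensor) -> torch.Tensor:
        feat = self.encoder(img_pair).flatten(1)
        return 0.01 * self.head(torch.cat([feat, pose6], dim=1))


def estimate_pose(img_t, img_s, hem, irm, n_iters=3):
    """Initial rotation from the HEM, then iterative 6-DoF residual refinement."""
    pair = torch.cat([img_t, img_s], dim=1)
    rot_init = hem(pair)                                              # (B, 3)
    pose = torch.cat([rot_init, torch.zeros_like(rot_init)], dim=1)  # (B, 6)
    for _ in range(n_iters):
        pose = pose + irm(pair, pose)  # residual update at each iteration
    R = euler_to_rotation(pose[:, :3])
    t = pose[:, 3:]
    return R, t  # pose used to warp the source view for the photometric loss


if __name__ == "__main__":
    img_t = torch.rand(2, 3, 192, 256)
    img_s = torch.rand(2, 3, 192, 256)
    R, t = estimate_pose(img_t, img_s, HEM(), IRM(), n_iters=3)
    print(R.shape, t.shape)  # torch.Size([2, 3, 3]) torch.Size([2, 3])
```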
