Multi-Scale Fusion Networks with RGB Image Features for Depth Map Completion

Bolun Zheng, Chuhua Xian, Dongjiu Zhang

doi:10.3724/sp.j.1089.2021.18861

Abstract

Currently, researchers cannot directly train an end-to-end model for depth image completion because paired “incomplete-complete” RGB-D datasets are lacking. To address this problem, a random-mask-based method, combined with a joint training strategy on “real-synthetic” data, is proposed to construct paired incomplete-complete RGB-D data. The method generates depth maps with different missing ratios from random masks, and uses synthetic scene datasets to supply high-fidelity ground-truth values for the missing regions of depth maps. Based on this strategy, a multi-scale depth map completion network is constructed that fuses the corresponding RGB image features. The proposed network extracts RGB image features and depth map features at different scales from its RGB image and depth map branches. Then, in the feature fusion branch, the RGB image features and depth map features are fused at each scale, so that the rich semantic features of the RGB image and the information in the depth map are effectively integrated for depth map completion. Experiments on the NYU-Depth V2 dataset show that, across depth completion tasks with different missing ratios, the average threshold accuracy of this method is 0.98 and the mean relative error is about 0.061. Compared with existing methods based on neural networks and on optimizing sparse equations, this method improves the threshold accuracy by 0.02 on average and reduces the mean relative error by 0.027 on average.
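The pairing idea and the two reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the random masks are drawn or which threshold is used, so the uniform per-pixel mask, the hole value of 0, the threshold of 1.25, and all function names below are assumptions made for illustration only.

```python
import random


def random_mask(height, width, missing_ratio, seed=None):
    """Return a height x width grid of booleans; True marks a pixel to remove.

    Assumption: pixels are dropped uniformly at random. The paper's masks
    may have a different shape (for example, contiguous regions).
    """
    rng = random.Random(seed)
    n_total = height * width
    n_missing = round(missing_ratio * n_total)
    dropped = set(rng.sample(range(n_total), n_missing))
    return [[(r * width + c) in dropped for c in range(width)]
            for r in range(height)]


def apply_mask(depth, mask, hole_value=0.0):
    """Build the incomplete depth map: masked pixels become hole_value."""
    return [[hole_value if m else d for d, m in zip(d_row, m_row)]
            for d_row, m_row in zip(depth, mask)]


def _valid_pairs(pred, truth):
    """Yield (predicted, true) depths where the ground truth is valid (> 0)."""
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if t > 0:
                yield p, t


def mean_relative_error(pred, truth):
    """Mean of |pred - truth| / truth over valid ground-truth pixels."""
    pairs = list(_valid_pairs(pred, truth))
    return sum(abs(p - t) / t for p, t in pairs) / len(pairs)


def threshold_accuracy(pred, truth, threshold=1.25):
    """Fraction of valid pixels with max(pred/truth, truth/pred) < threshold.

    Assumption: 1.25 is a commonly used threshold; the abstract does not
    state which value the authors use. Non-positive predictions are misses.
    """
    pairs = list(_valid_pairs(pred, truth))
    hits = sum(1 for p, t in pairs if p > 0 and max(p / t, t / p) < threshold)
    return hits / len(pairs)


# Build one "incomplete-complete" pair from a toy 4 x 5 complete depth map.
complete = [[1.0 + 0.1 * (r + c) for c in range(5)] for r in range(4)]
mask = random_mask(4, 5, missing_ratio=0.4, seed=0)
incomplete = apply_mask(complete, mask)
```

In a training setup along these lines, `incomplete` would be the network input and `complete` the target, with `missing_ratio` varied to cover the different missing ratios mentioned above; a completed prediction would then be scored against `complete` with the two metric functions.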

Full Text