Abstract

The truncated signed distance function (TSDF) fusion is one of the key operations in the 3D reconstruction process. However, existing TSDF fusion methods usually suffer from the inevitable sensor noises. In this paper, we propose a new TSDF fusion network, named DFusion, to minimize the influences from the two most common sensor noises, i.e., depth noises and pose noises. To the best of our knowledge, this is the first depth fusion for resolving both depth noises and pose noises. DFusion consists of a fusion module, which fuses depth maps together and generates a TSDF volume, as well as the following denoising module, which takes the TSDF volume as the input and removes both depth noises and pose noises. To utilize the 3D structural information of the TSDF volume, 3D convolutional layers are used in the encoder and decoder parts of the denoising module. In addition, a specially-designed loss function is adopted to improve the fusion performance in object and surface regions. The experiments are conducted on a synthetic dataset as well as a real-scene dataset. The results prove that our method outperforms existing methods.

Highlights

  • Depth fusion is of great importance for many applications, such as augmented reality applications and autonomous driving

  • To compare with state-of-the-art methods, our method is evaluated with four metrics, i.e., the mean squared error (MSE), the mean absolute distance (MAD), intersection over union (IoU) and accuracy (ACC)

  • MSE and MAD mainly focus on the distance between the estimated truncated signed distance function (TSDF) and the ground truth, while IoU and ACC quantify the occupancy of the estimation

Read more

Summary

Introduction

Depth fusion is of great importance for many applications, such as augmented reality applications and autonomous driving. TSDF requires manual adjustment on its parameters, possibly leading to thick artifacts. To address this problem, some depth fusion methods have emerged with improved performance. Some depth fusion methods have emerged with improved performance Methods such as [2,3] use surfel-based or probabilistic approaches to generate 3D representations, which may be a voxel grid, a mesh or a point cloud. Compared with these classical methods, convolutional neural network (CNN) based methods have shown advantages in the fusion performance. Their results still suffer from noisy input, which results in missing surface details and incomplete geometry [4]

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.