Abstract

Single-shot 3D imaging and shape reconstruction have seen a surge of interest due to the rapid evolution of sensing technologies. In this paper, a robust single-shot 3D shape reconstruction technique that integrates the structured-light technique with deep convolutional neural networks (CNNs) is proposed. The input of the technique is a single fringe-pattern image, and the output is the corresponding depth map for 3D shape reconstruction. The training and validation datasets, with high-quality 3D ground-truth labels, are prepared using a multi-frequency fringe projection profilometry technique. Unlike conventional 3D shape reconstruction methods, which involve complex algorithms and intensive computation to determine phase distributions or pixel disparities as well as the depth map, the proposed approach uses an end-to-end network architecture to directly transform a 2D image into its corresponding 3D depth map without extra processing. Three CNN-based models are adopted for comparison. Furthermore, the accurate structured-light-based 3D imaging dataset used in this paper is made publicly available. Experiments have been conducted to demonstrate the validity and robustness of the proposed technique, which is capable of satisfying various 3D shape reconstruction demands in scientific research and engineering applications.

Highlights

  • Non-contact 3D shape reconstruction using structured-light techniques is commonly used in a broad range of applications, including machine vision, reverse engineering, and quality assurance. Structured-light techniques fall into two categories: multi-shot [12,13,14,15] and single-shot [16,17,18]

  • The multi-shot techniques are capable of capturing high-resolution 3D images at a limited speed and are widely used in industrial metrology for accurate shape reconstruction

  • The most reliable fringe projection profilometry (FPP) technique involves projecting a set of phase-shifted sinusoidal fringe patterns from a projector onto the objects, where the surface depth or height information is naturally encoded into the camera-captured fringe patterns for the subsequent 3D reconstruction process
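
The phase-shifting idea in the last highlight can be illustrated with a minimal sketch. It assumes the textbook N-step model, in which one pixel sees the intensities I_n = A + B cos(phi + 2*pi*n/N), and it recovers only the wrapped phase phi. The function names, the default values of A and B, and the single-pixel scope are illustrative assumptions, not details taken from the paper.

```python
import math


def fringe_intensities(phase, steps=4, mean=0.5, amplitude=0.5):
    """Simulate what one camera pixel records under `steps` sinusoidal
    fringe patterns that are equally shifted over one full period.

    Assumed model (not from the paper): I_n = mean + amplitude * cos(phase + 2*pi*n/steps).
    """
    return [
        mean + amplitude * math.cos(phase + 2 * math.pi * n / steps)
        for n in range(steps)
    ]


def wrapped_phase(intensities):
    """Recover the wrapped phase in (-pi, pi] from equally shifted intensities.

    With at least three steps, the sine-weighted sum equals -(N*B/2)*sin(phase)
    and the cosine-weighted sum equals (N*B/2)*cos(phase), so the constant
    background term cancels and atan2 returns the phase.
    """
    steps = len(intensities)
    if steps < 3:
        raise ValueError("at least three phase-shifted intensities are required")
    sin_sum = sum(i * math.sin(2 * math.pi * n / steps) for n, i in enumerate(intensities))
    cos_sum = sum(i * math.cos(2 * math.pi * n / steps) for n, i in enumerate(intensities))
    return math.atan2(-sin_sum, cos_sum)


if __name__ == "__main__":
    samples = fringe_intensities(1.0, steps=4)
    print(round(wrapped_phase(samples), 6))
```

The result is a wrapped phase only. Turning it into depth additionally requires phase unwrapping (for example with multi-frequency patterns, as mentioned in the abstract) and system calibration, neither of which is covered by this sketch.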


Summary

Introduction

Non-contact 3D shape reconstruction using structured-light techniques is commonly used in a broad range of applications, including machine vision, reverse engineering, and quality assurance. Structured-light techniques fall into two categories: multi-shot [12,13,14,15] and single-shot [16,17,18]. The multi-shot techniques are capable of capturing high-resolution 3D images at a limited speed and are widely used in industrial metrology for accurate shape reconstruction. The single-shot techniques can acquire 3D images at a fast speed to deal with dynamic scenes and are receiving tremendous attention in the fields of entertainment and robotics. The application of deep machine learning to the highly demanded single-shot 3D shape reconstruction has become feasible. In the machine learning field, deep convolutional neural networks (CNNs) have found numerous applications in object detection, image classification, scene understanding, medical image analysis, natural language processing, and other areas. Recent advances in using deep CNNs for image segmentation aim to make the network architecture an end-to-end learning process. Badrinarayanan et al. [21] proposed upsampling the lowest-resolution encoder output to improve the resolution of the output with fewer computational resources.

Sensors 2020, 20, 3718
