Abstract

To improve prediction accuracy while keeping execution time low in image depth map generation, we investigate unsupervised monocular image depth prediction. In this paper, an unsupervised monocular image depth prediction method based on multiple-loss deep learning is designed from the following two aspects. First, a monocular image depth estimation algorithm based on multi-scale feature extraction is proposed, which includes two parts: a feature extraction network and a deconvolution prediction network. The feature extraction network extracts image features at different levels of the network and introduces the acquired multi-scale features into the deconvolution layer without changing the image resolution. Through training, the left and right disparity maps can eventually be predicted. Second, we provide a new multiple loss function that uses the asymmetric parameters of the training model and the epipolar geometry constraint. The Multi-Scale Structural Similarity Index (MS-SSIM) and the L1 loss are combined as the image reconstruction loss, and the left-right disparity consistency and the flipped left-right disparity consistency are incorporated into the loss function used to train the network model. The simulation results show that this method can effectively improve prediction accuracy, particularly for complex images with mirrors, transparent surfaces, and shadows. The KITTI dataset is further used to evaluate our method, which achieves end-to-end results that even exceed those of a supervised method.

Highlights

  • As a fundamental problem of computer vision, image depth estimation has received significant attention in both industry and academia

  • We propose an unsupervised monocular image depth prediction algorithm and the simulation results show that the method improves the accuracy of image depth prediction

  • We propose an unsupervised monocular image depth prediction algorithm based on multiple loss deep learning

Summary

INTRODUCTION

As a fundamental problem of computer vision, image depth estimation has received significant attention in both industry and academia. To solve this problem, many studies have emerged on monocular depth estimation algorithms with supervised learning [5], [6]. These methods directly train a convolutional neural network (CNN) on a large amount of ground truth depth data, and the trained model directly predicts the depth of each pixel in the image. We propose an unsupervised monocular image depth prediction algorithm based on multiple-loss deep learning. This network architecture can obtain left and right disparity maps without the ground truth depth. The proposed monocular image depth estimation architecture is a CNN structure based on ResNet-50; in the architecture figure, the blocks C (yellow), P (red), d (blue), and b (purple) correspond to convolution, max pooling, disparity maps, and blocks. We incorporate the flipped left-right disparity consistency into the loss function used to train the network model, so that the post-processing step is integrated directly into our network. This significantly reduces the testing time for our images. In (6), $x_f$ denotes the pixel coordinates in the input flipped left image, $d^{l_f}_{x_f}$ is the disparity map of the flipped left image, and $d^{fl}_{x_f}$ is the horizontal flip of the left disparity map $d^{l}_{x}$; $d^{fl}_{x_f}$ and $d^{l_f}_{x_f}$ are theoretically equal.
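The flipped left-right consistency idea can be sketched as follows. This summary does not reproduce equation (6), so the sketch assumes a mean absolute (L1) penalty between the horizontally flipped left disparity map and the disparity map predicted for the flipped left image. The function names and the NumPy array layout (height by width) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def flip_horizontal(disparity):
    """Mirror an (H, W) disparity map about its vertical axis."""
    return disparity[:, ::-1]


def flipped_consistency_loss(disp_left, disp_of_flipped_left):
    """Assumed L1 form of a flipped left-right consistency penalty.

    disp_left:            disparity map predicted for the left image.
    disp_of_flipped_left: disparity map predicted for the flipped left image.

    In theory the flipped version of the first equals the second, so the
    loss is zero when the network's predictions are perfectly consistent.
    """
    flipped_disp_left = flip_horizontal(disp_left)
    return float(np.mean(np.abs(flipped_disp_left - disp_of_flipped_left)))


# Toy example: a 3x4 disparity map and two candidate predictions.
disp = np.arange(12, dtype=float).reshape(3, 4)
consistent_loss = flipped_consistency_loss(disp, disp[:, ::-1])
inconsistent_loss = flipped_consistency_loss(disp, disp)
```

In a training setup, a term of this kind would be weighted and added to the reconstruction and left-right consistency terms. The weighting is not given in this summary and is left out of the sketch.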

EXPERIMENT
Findings
CONCLUSION