Accurate dense depth from light field technology for object segmentation and 3D computer vision

Wai Y K San

doi:10.14264/uql.2020.887

Abstract

Depth estimation affects feature extraction, object segmentation and three-dimensional (3D) reconstruction. Practical applications that depend on depth estimation include autonomous vehicle navigation, computational photography, and augmented and virtual reality. Recent advances in neural network algorithms have been linked to increases in depth estimation accuracy. With the aid of graphics processing units (GPUs), neural networks have been used to improve the accuracy of depth estimation from video, multiple images, stereo image pairs or a single image. Despite the emphasis on neural networks for improving depth estimation algorithms, the datasets used in experimental evaluations have not received the same amount of attention. The benchmark datasets for depth estimation may be inefficient for obtaining real data, or may lack high-resolution images or scenes containing complex object structures. Consequently, many state-of-the-art depth estimation algorithms fail in practical applications, since these scenarios are not considered during the training procedure. Recent state-of-the-art depth estimation algorithms are stated by their authors to be unsuitable for scenes with occlusion or non-Lambertian surfaces. The Lambertian approximation is defined by two cases. The first is that different viewpoints from multiple cameras are photo-consistent. The second is that objects are composed of a collection of piecewise planar surfaces. Since many real objects do not satisfy these two cases, algorithms that assume the Lambertian approximation produce low accuracy in practical applications. In this thesis, we propose a cost-efficient, high-resolution dataset that contains scenes with challenging object shapes. The dataset is acquired with a Lytro camera, using light field technology for depth estimation. The proposed dataset contains high-resolution images with objects addressing the Lambertian approximation problem.
This dataset aims to improve the accuracy of current depth estimation algorithms by introducing the difficult evaluation cases observed in real-world scenarios. The depth information used as ground truth in our proposed dataset is adapted from the Lytro software, which converts the four-dimensional (4D) light field file into an isolated depth channel. Compared with benchmark datasets, this is a cost-effective solution for acquiring real data. We also present a detailed study of generative adversarial networks and use them to predict depth from a single view. During the training procedure of the generative adversarial network, a loss function is optimised for the purpose of depth prediction from a single image. We analyse the generative adversarial method for depth from a single image on the proposed depth dataset in addition to benchmark depth datasets.
