Abstract

This article proposes a zero-shot learning-based framework for light field depth estimation, which learns an end-to-end mapping from an input light field to the corresponding disparity map with neither extra training data nor supervision from ground-truth depth. The proposed method overcomes two major difficulties of existing learning-based methods and is thus far more practical. First, it eliminates the heavy burden of collecting ground-truth depth for a wide variety of scenes to serve as training labels. Second, it avoids the severe domain-shift effect that arises when a trained model is applied to light fields whose content or camera configuration differs drastically from the training data. Compared with conventional non-learning-based methods, on the other hand, the proposed method better exploits the correlations in the 4D light field and produces markedly better depth results. Moreover, we extend this zero-shot learning framework to depth estimation from light field videos. For the first time, we demonstrate that more accurate and robust depth can be estimated from light field videos by jointly exploiting correlations across the spatial, angular, and temporal dimensions. We conduct comprehensive experiments on both synthetic and real-world light field image datasets, as well as a self-collected light field video dataset. Quantitative and qualitative results validate the superior performance of our method over state-of-the-art approaches, especially on challenging real-world scenes.
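To make the "zero-shot" idea concrete, the following is a minimal illustrative sketch of per-scene, self-supervised depth estimation from a light field: a small network is optimized on a single input light field alone, with a photometric warping loss across sub-aperture views standing in for ground-truth supervision. All names here (`DispNet`, `warp_view`, `zero_shot_depth`), the network architecture, and the specific loss are assumptions for illustration, not the paper's exact formulation.

```python
# A minimal sketch, assuming PyTorch and a light field stored as a
# (U, V, 3, H, W) tensor of sub-aperture views. Hypothetical, not the
# paper's actual model or loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DispNet(nn.Module):
    """Tiny CNN mapping the central sub-aperture view to a disparity map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def warp_view(view, disp, du, dv):
    """Warp a sub-aperture view toward the center view using the predicted
    disparity and its angular offset (du, dv), via bilinear grid sampling."""
    b, _, h, w = view.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=view.device),
        torch.linspace(-1, 1, w, device=view.device),
        indexing="ij",
    )
    # Shift sampling positions by disparity scaled by the angular offset;
    # pixel shifts are normalized to grid_sample's [-1, 1] coordinates.
    xs = xs.expand(b, h, w) + 2.0 * disp.squeeze(1) * du / (w - 1)
    ys = ys.expand(b, h, w) + 2.0 * disp.squeeze(1) * dv / (h - 1)
    grid = torch.stack((xs, ys), dim=-1)
    return F.grid_sample(view, grid, align_corners=True)

def zero_shot_depth(lf, center=(4, 4), iters=2000, lr=1e-3):
    """Optimize on this one scene only: no labels, no external training data."""
    u0, v0 = center
    target = lf[u0, v0].unsqueeze(0)          # center view, (1, 3, H, W)
    model = DispNet()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(iters):
        disp = model(target)                  # (1, 1, H, W)
        loss = 0.0
        for u in range(lf.shape[0]):
            for v in range(lf.shape[1]):
                if (u, v) == (u0, v0):
                    continue
                warped = warp_view(lf[u, v].unsqueeze(0), disp, u - u0, v - v0)
                loss = loss + F.l1_loss(warped, target)  # photometric consistency
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model(target).detach()
```

The video extension described in the abstract would, under the same assumptions, add a temporal term to this loss (e.g., warping-based consistency between disparity maps of neighboring frames) so that spatial, angular, and temporal correlations are exploited jointly.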
