The angular information of light, which is lost in conventional images but preserved in light fields, plays an instrumental role in many applications such as depth estimation, 3D reconstruction, and post-capture refocusing. However, the limited angular resolution of light-field images, imposed by consumer hardware constraints, remains a major obstacle to their widespread adoption. In this article, we present a novel deep-learning-based method that synthesizes light-field views from a sparse set of input views. Our proposed method, an end-to-end trainable network, utilizes convolutional block attention modules to enhance its built-in depth image-based rendering. The convolutional block attention module consists of two attention modules applied sequentially along the channel and spatial dimensions, focusing the network on critical features. The proposed architecture combines three sub-networks: one for stereo feature extraction, one for disparity estimation, and one for attention-based refinement. We present two schemes for the refinement network that perform equally well but differ in parameter count by 1.2% and in execution time by 44%. Quantitative and qualitative results on four challenging real-world datasets demonstrate the superiority of the proposed method, which achieves considerable PSNR gains: around 1 dB over the state of the art and around 0.5 dB over our previous method, LFVS-AM. In addition, ablation studies demonstrate the effectiveness of each module of the proposed method. Finally, the parallax edge precision-recall curve shows that our method better preserves parallax details.
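The sequential channel-then-spatial attention described in the abstract follows the standard convolutional block attention module (CBAM) design of Woo et al. The PyTorch sketch below illustrates that general mechanism only; the class names, reduction ratio, and 7x7 spatial kernel are conventional CBAM defaults assumed here, not details taken from the proposed refinement network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Channel attention: squeeze spatial dims with average- and max-pooling,
    feed both through a shared two-layer MLP, and gate each channel."""
    def __init__(self, channels, reduction=16):  # reduction=16 is an assumed default
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1))  # (N, C, 1, 1)
        mx = self.mlp(F.adaptive_max_pool2d(x, 1))   # (N, C, 1, 1)
        return x * torch.sigmoid(avg + mx)


class SpatialAttention(nn.Module):
    """Spatial attention: pool across channels (mean and max), then a single
    conv produces a per-pixel gate."""
    def __init__(self, kernel_size=7):  # 7x7 is the conventional CBAM choice
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)        # (N, 1, H, W)
        mx, _ = x.max(dim=1, keepdim=True)       # (N, 1, H, W)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class CBAM(nn.Module):
    """Convolutional block attention module: channel attention followed
    sequentially by spatial attention, as the abstract describes."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.channel = ChannelAttention(channels, reduction)
        self.spatial = SpatialAttention(kernel_size)

    def forward(self, x):
        return self.spatial(self.channel(x))


# Example: refine a hypothetical 64-channel feature map.
features = torch.randn(1, 64, 32, 32)
refined = CBAM(64)(features)  # same shape, attention-reweighted
```

Because both gates are lightweight (a small MLP and one convolution), such a module adds negligible parameters and can be dropped after any convolutional block, which is consistent with its use here inside a refinement sub-network.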