Abstract
In this paper, we propose a novel filtering method based on deep attention networks for the quality enhancement of light field (LF) images captured by plenoptic cameras and compressed using the High Efficiency Video Coding (HEVC) standard. The proposed architecture was built using efficient complex processing blocks and novel attention-based residual blocks. The network takes advantage of the macro-pixel (MP) structure, specific to LF images, and processes each reconstructed MP in the luminance (Y) channel. The input patch is represented as a tensor that collects, from an MP neighbourhood, four Epipolar Plane Images (EPIs) at four different angles. The experimental results on a common LF image database showed high improvements over HEVC in terms of the structural similarity index (SSIM), with an average Y-Bjøntegaard Delta (BD)-rate savings of and an average Y-BD-PSNR improvement of dB. Increased performance was achieved when the HEVC built-in filtering methods were skipped. The visual results illustrate that the enhanced image contains sharper edges and more texture details. The ablation study provides two robust solutions to reduce the inference time by and the network complexity by . The results demonstrate the potential of attention networks for the quality enhancement of LF images encoded by HEVC.
Highlights
Introduction
In contrast to conventional Red-Green-Blue (RGB) cameras, which only capture light intensity, plenoptic cameras provide the unique ability of distinguishing between the light rays that hit the camera sensor from different directions using microlens technology.
The raw light field (LF) image contains the entire information captured by the camera sensor, where the array of microlenses generates a corresponding array of MPs, a structure known as lenslet images.
We proposed a novel Convolutional Neural Network (CNN)-based filtering method for the quality enhancement of LF images compressed by High Efficiency Video Coding (HEVC).
Summary
In contrast to conventional Red-Green-Blue (RGB) cameras, which capture only light intensity, plenoptic cameras use microlens technology to distinguish between light rays that hit the camera sensor from different directions. To this end, the main lens of a plenoptic camera focuses light rays onto a microlens plane, and each microlens captures the incoming light rays from different angles and directs them onto the camera sensor. The raw LF image contains all the information captured by the camera sensor: the array of microlenses generates a corresponding array of macro-pixels (MPs), a structure known as a lenslet image.
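The relationship between the lenslet image and its MPs can be illustrated with a minimal NumPy sketch. It assumes an idealized, already rectified lenslet image in a single (luminance) channel, with square MPs laid out on a regular grid; real plenoptic captures need calibration and demosaicing first. The function name `lenslet_to_views` and the parameter `mp_size` are hypothetical and are not taken from the paper.

```python
import numpy as np


def lenslet_to_views(lenslet, mp_size):
    """Rearrange an idealized lenslet image into directional views.

    Assumption: `lenslet` is a 2-D array tiled by square macro-pixels of
    side `mp_size`, one macro-pixel per microlens.
    """
    height, width = lenslet.shape
    if height % mp_size or width % mp_size:
        raise ValueError("image size must be a multiple of mp_size")
    rows, cols = height // mp_size, width // mp_size
    # Axes after reshape: (MP row, angular row, MP column, angular column).
    macro_pixels = lenslet.reshape(rows, mp_size, cols, mp_size)
    # Move the angular axes to the front, so that views[u, v] gathers the
    # pixel at position (u, v) of every macro-pixel.
    return macro_pixels.transpose(1, 3, 0, 2)


# Toy example: a 6x6 lenslet image made of 2x2 macro-pixels of size 3x3.
lenslet = np.arange(36).reshape(6, 6)
views = lenslet_to_views(lenslet, mp_size=3)
print(views.shape)  # (3, 3, 2, 2)
```

In this layout, `views[:, :, i, j]` is the MP produced by the microlens at grid position `(i, j)`, while `views[u, v]` is the image formed by the rays arriving from one direction.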