Efficient Light Field Image Compression with Enhanced Random Access

Abstract

In light field image compression, facilitating random access to individual views plays a significant role in decoding views quickly, reducing memory footprint, and decreasing the bandwidth requirement for transmission. Highly efficient light field image compression methods mainly use inter-view prediction. Therefore, they typically do not provide random access to individual views. On the other hand, methods that provide full random access usually reduce compression efficiency. To address this trade-off, a light field image encoding method that favors random access is proposed in this paper. Light field image views are grouped into independent sets of 3×3 views, which are called Macro View Images (MVIs). To encode MVIs, the central view is used as a reference to compress its adjacent neighboring views using a hierarchical reference structure. To encode the central view of each MVI, the most central view, along with the centers of a maximum of three MVIs, is used as a reference image for the disparity estimation. In addition, the proposed method allows the use of parallel processing to reduce the maximum encoding/decoding time complexity on multi-core processors. Tile partitioning can also be used to randomly access different regions of the light field images. The simulation results show that the proposed method outperforms other state-of-the-art methods in terms of compression efficiency while providing random access to both views and regions of interest.

Similar Papers
  • Conference Article
  • Cite Count: 3
  • 10.1109/dcc50243.2021.00012
SLFC: Scalable Light Field Coding
  • Mar 1, 2021
  • Hadi Amirpour + 2 more

Light field imaging enables some post-processing capabilities like refocusing, changing view perspective, and depth estimation. As light field images are represented by multiple views, they contain a huge amount of data that makes compression inevitable. Although there are some proposals to efficiently compress light field images, their main focus is on encoding efficiency. However, some important functionalities such as viewpoint and quality scalabilities, random access, and uniform quality distribution have not been addressed adequately. In this paper, an efficient light field image compression method based on a deep neural network is proposed, which classifies multiple views into various layers. In each layer, the target view is synthesized from the available views of previously encoded/decoded layers using a deep neural network. This synthesized view is then used as a virtual reference for the target view inter-coding. In this way, random access to an arbitrary view is provided. Moreover, uniform quality distribution among multiple views is addressed. At higher bitrates, where random access to an arbitrary view is more crucial, the required bitrate to access the requested view is minimized.

  • Conference Article
  • Cite Count: 22
  • 10.1109/icip.2018.8451731
Macro-Pixel Prediction Based on Convolutional Neural Networks for Lossless Compression of Light Field Images
  • Oct 1, 2018
  • Ionut Schiopu + 1 more

The paper introduces a novel macro-pixel prediction method based on Convolutional Neural Networks (CNN) for lossless compression of light field images. In the proposed method, each macro-pixel is predicted based on a volume of macro-pixels from its immediate causal neighborhood. The proposed deep neural network operates on these macro-pixel volumes and provides accurate macro-pixel prediction in light field images. The resulting macro-pixel residuals are encoded by a reference codec built based on the CALIC codec. A context modeling method for light field images is proposed. Experimental results on a large light field image dataset show that the proposed prediction method systematically and substantially outperforms state-of-the-art predictors. To our knowledge, the paper is the first to introduce deep-learning based prediction of macro-pixels, enabling efficient lossless compression of light field images.

  • Conference Article
  • Cite Count: 10
  • 10.1109/dcc.2019.00065
Light Field Image Compression with Random Access
  • Mar 1, 2019
  • Hadi Amirpour + 4 more

In light field compression, besides coding efficiency, providing random access to individual views is also a very significant factor. Highly efficient compression methods usually lack random access. Similarly, random access methods usually reduce the compression efficiency. To address this trade-off, a light field image encoding method is proposed in this paper which favors random access. In the proposed scheme, 15×15 view images are divided into 25 independent 3×3 view images, which are called Macro View Images (MVIs). To encode MVIs, the central view image is used to compress its immediate neighboring view images using a hierarchical reference structure. To encode the central view of each MVI, the most central view image, along with the centers of at most three MVIs, is used as a reference image for the disparity estimation. In addition, the proposed method enables the use of parallel computation to improve encoding/decoding time complexity. To reduce memory footprint in case a Region of Interest (ROI) is required, HEVC tile partitioning is used.
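The grouping of 15×15 views into 25 MVIs of 3×3 views can be illustrated with a minimal Python sketch. The function names (`partition_into_mvis`, `mvi_center_of`) and the zero-based `(row, col)` indexing are assumptions made for illustration; the sketch covers only the partitioning step and does not reproduce the paper's hierarchical reference structure or its choice of reference MVIs.

```python
from typing import Dict, List, Tuple

View = Tuple[int, int]  # hypothetical (row, col) index of a view, zero-based


def partition_into_mvis(grid: int = 15, mvi: int = 3) -> Dict[View, List[View]]:
    """Map the central view of each MVI to the list of views in that MVI."""
    if grid % mvi != 0:
        raise ValueError("grid size must be a multiple of the MVI size")
    half = mvi // 2
    mvis: Dict[View, List[View]] = {}
    for r0 in range(0, grid, mvi):
        for c0 in range(0, grid, mvi):
            center = (r0 + half, c0 + half)
            mvis[center] = [
                (r, c)
                for r in range(r0, r0 + mvi)
                for c in range(c0, c0 + mvi)
            ]
    return mvis


def mvi_center_of(view: View, mvi: int = 3) -> View:
    """Return the central view of the MVI that contains the given view."""
    r, c = view
    half = mvi // 2
    return (r // mvi * mvi + half, c // mvi * mvi + half)


mvis = partition_into_mvis()
most_central = (15 // 2, 15 // 2)  # the most central view of the whole grid

print(len(mvis))               # number of MVIs
print(most_central in mvis)    # the most central view is itself an MVI center
print(mvi_center_of((5, 9)))   # center of the MVI containing view (5, 9)
```

Because every non-central view belongs to exactly one MVI, looking up its MVI center is a constant-time index computation, which is one way to see why grouping views this way keeps the set of views needed for decoding small.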

  • Book Chapter
  • Cite Count: 4
  • 10.1007/978-3-319-77380-3_8
Light Field Image Compression Scheme Based on MVD Coding Standard
  • Jan 1, 2018
  • Xinpeng Huang + 3 more

In this paper, we propose a new Light Field Image (LFI) compression scheme based on the Multiview Video plus Depth (MVD) coding architecture. Through LF function analysis, we preliminarily estimate the depth map according to the concept of the Epipolar Plane Image (EPI). Such a rough estimation causes some erroneous pixels within the initial depth map, so we design a weighting mean filter to smooth the inaccurate region. The final estimated depth maps can be encoded by the MVD coding standard jointly with a small number of viewpoint images in the LFI, so as to improve the compression efficiency of the LFI. Ultimately, extensive experiments are conducted on 4 LFIs to verify the effectiveness of the proposed compression scheme. The simulation results demonstrate that our LFI compression scheme can achieve high LFI compression performance and outperform the state-of-the-art coding solution.

  • Conference Article
  • Cite Count: 9
  • 10.1109/icip.2004.1421601
Rate-distortion analysis of random access for compressed light fields
  • Oct 24, 2004
  • P Ramanathan + 1 more

Image-based rendering data sets, such as light fields, require efficient compression due to their large data size, but also easy random access when rendering from the data set. Efficient compression usually depends upon prediction between images, which creates dependencies between them, conflicting with the requirement of having easy random access. Existing light field coders concentrate either on compression efficiency, or use ad hoc methods to design prediction that balances random access and compression efficiency requirements. In this paper, we study this joint problem of compression efficiency and random access. We propose a model for light field image generation, light field image coding and rendering novel views from these light field images. We present a view-dependent rate-distortion measure that allows us to consider random access and compression efficiency simultaneously. We compare the theoretical results from the model with the experimental results from our DCT-based coder and show that they qualitatively give similar results. Finally, we suggest how, with this model, we can better optimize the prediction dependency structure in our coder for random access and compression efficiency performance.

  • Book Chapter
  • Cite Count: 2
  • 10.1007/978-3-319-77842-6_6
Light Field Image Compression
  • Jul 29, 2018
  • Caroline Conti + 8 more

Light field imaging based on a single-tier camera equipped with a micro-lens array has currently risen up as a practical and prospective approach for future visual applications and services. However, successfully deploying actual light field imaging applications and services will require identifying adequate coding solutions to efficiently handle the massive amount of data involved in these systems. In this context, this chapter presents some of the most recent light field image coding solutions that have been investigated. After a brief review of the current state of the art in image coding formats for light field photography, an experimental study of the rate-distortion performance for different coding formats and architectures is presented. Then, aiming at enabling faster deployment of light field applications and services in the consumer market, a scalable light field coding solution that provides backward compatibility with legacy display devices (e.g., 2D, 3D stereo, and 3D multiview) is also presented. Furthermore, a light field coding scheme based on a sparse set of microimages and the associated blockwise disparity is also presented. This coding scheme is scalable with three layers such that the rendering can be performed with the sparse micro-image set, the reconstructed light field image, and the decoded light field image.

  • Book Chapter
  • Cite Count: 2
  • 10.1007/978-3-319-95504-9_24
Light Field Image Compression
  • Jan 1, 2018
  • Li Li + 1 more

The light field image, also known as the plenoptic image, contains information about not only the intensity of light in a scene but also the direction of the light rays in space. Since the light field image contains very rich photometric and geometric information, it will have very widespread application in the future. Examples include immersive content capture for virtual and mixed reality presentation and depth from light field for autonomous driving applications. To be more specific, a light field image can be enhanced with physical models for an autonomous decision making process, which is also an important task of the Dynamic Data Driven Applications Systems (DDDAS) [11]. Besides, the rich geometric and photometric information contained in the light field can be updated with real-time measurements, which is a focus of DDDAS such as smart city related image and video processing tasks. However, to make light field images easier to use, one of the most important tasks is to compress them efficiently so they can be easily distributed over the current communication infrastructure.

  • Conference Article
  • Cite Count: 11
  • 10.1109/icassp.2019.8682820
Light Field Image Compression Using Depth-based CNN in Intra Prediction
  • May 1, 2019
  • Tingting Zhong + 3 more

Recently, light field images have received extensive attention due to their potential applications. Since they take up a huge amount of memory because of their super-high resolution, efficient compression methods are fundamentally required. In this paper, we propose a novel intra prediction mode by using a depth-adaptive convolutional neural network (DCNN). Light field projection finds the imaging response distribution for each object point using the depth estimated from each macropixel in the light field image. The highly correlated imaging responses are used to select the neural network structure. The network structure also adapts to the to-be-encoded block size. Adding the proposed DCNN-based prediction mode into the rate-distortion optimization loop with the other 35 intra prediction modes of HEVC, the proposed encoding scheme achieves a significant bit-rate saving compared to representative compression approaches with a limited increase in computational complexity. Statistical data are also provided and analyzed to demonstrate the efficiency of the proposed method.

  • Conference Article
  • Cite Count: 32
  • 10.1109/pcs50896.2021.9477448
Learning-Based Practical Light Field Image Compression Using A Disparity-Aware Model
  • Jun 1, 2021
  • Mohana Singh + 1 more

Light field technology has increasingly attracted the attention of the research community with its many possible applications. The lenslet array in commercial plenoptic cameras helps capture both the spatial and angular information of light rays in a single exposure. While the resulting high dimensionality of light field data enables its superior capabilities, it also impedes its extensive adoption. Hence, there is a compelling need for efficient compression of light field images. Existing solutions are commonly composed of several separate modules, some of which may not have been designed for the specific structure and quality of light field data. This increases the complexity of the codec and results in impractical decoding runtimes. We propose a new learning-based, disparity-aided model for compression of 4D light field images capable of parallel decoding. The model is end-to-end trainable, eliminating the need for hand-tuning separate modules and allowing joint learning of rate and distortion. The disparity-aided approach ensures the structural integrity of the reconstructed light fields. Comparisons with the state of the art show encouraging performance in terms of PSNR and MS-SSIM metrics. Also, there is a notable gain in the encoding and decoding runtimes. Source code is available at https://moha23.github.io/LF-DAAE.

  • Conference Article
  • Cite Count: 15
  • 10.1109/pcs.2016.7906404
L1-optimized linear prediction for light field image compression
  • Jan 1, 2016
  • Rui Zhong + 5 more

The advent of consumer-level plenoptic cameras has sparked interest in the design of efficient compression techniques for light field images. State-of-the-art compression systems such as HEVC prove to be inefficient when directly applied to this type of data due to the inherent spatial discontinuities among neighboring microlens images. In this paper, a novel light field image compression system is proposed. The disk-shaped pixel clusters corresponding to each microlens in the light field image are efficiently predicted based on the neighboring disks. In this context, an optimized linear prediction design based on L1 minimization of the residuals is proposed. K-means clustering is employed on training data in order to determine the optimized set of predictors. The experimental results on an extensive set of light field images demonstrate that the proposed coding scheme yields an average of 2.93 dB and 3.22 dB gain in PSNR, and 52.67% and 57.27% average rate savings compared to HEVC and JPEG2000 respectively.

  • Conference Article
  • Cite Count: 23
  • 10.1109/icassp43922.2022.9747377
SADN: Learned Light Field Image Compression with Spatial-Angular Decorrelation
  • May 23, 2022
  • Kedeng Tong + 3 more

The light field image has become one of the most promising media types for immersive video applications. In this paper, we propose a novel end-to-end spatial-angular-decorrelated network (SADN) for high-efficiency light field image compression. Different from the existing methods that exploit either spatial or angular consistency in the light field image, SADN decouples the angular and spatial information by dilation convolution and stride convolution in spatial-angular interaction, and performs feature fusion to compress spatial and angular information jointly. To train a stable and robust algorithm, a large-scale dataset consisting of 7549 light field images is proposed and built. The proposed method provides 2.137 times and 2.849 times higher compression efficiency relative to H.266/VVC and H.265/HEVC inter coding, respectively. It also outperforms the end-to-end image compression networks by an average of 79.6% bitrate saving with much higher subjective quality and light field consistency.

  • Conference Article
  • Cite Count: 10
  • 10.1109/dcc47342.2020.00047
Light Field Image Compression Using Multi-branch Spatial Transformer Networks Based View Synthesis
  • Mar 1, 2020
  • Jin Wang + 4 more

Recent years have witnessed the widespread use of light field imaging in interactive and immersive visual applications. To record the directional information of the light rays, larger storage space is required by light field images compared with conventional 2D images. Hence, the efficient compression of light field images is highly desired for further applications. In this paper, we propose a novel light field image compression scheme using multi-branch spatial transformer networks based view synthesis. Firstly, a sparse subset of views is selected and rearranged into a pseudo sequence to be encoded by a video codec at the encoder. Then the other unselected views are synthesized based on the similarity between neighboring views with our proposed method at the decoder. To better characterize the non-linear relationship between the sub-views, a multi-branch spatial transformer network (MSTN) is designed to adaptively learn the affine transformations between the neighboring views, which are used to warp the input views to generate accurate approximations of the target views. Moreover, to better obtain the final view from the generated approximation views, a Wasserstein generative adversarial network (WGAN) is applied with improved training. Experimental results show the superior compression performance of our scheme compared with the state-of-the-art methods.

  • Research Article
  • Cite Count: 22
  • 10.1109/tmm.2021.3068563
Light Field Image Coding Using VVC Standard and View Synthesis Based on Dual Discriminator GAN
  • Jan 1, 2021
  • IEEE Transactions on Multimedia
  • Nader Bakir + 4 more

Light field (LF) technology is considered a promising way of providing high-quality virtual reality (VR) content. However, such an imaging technology produces a large amount of data requiring efficient LF image compression solutions. In this paper, we propose an LF image coding method based on view synthesis and view quality enhancement techniques. Instead of transmitting all the LF views, only a sparse set of reference views is encoded and transmitted, while the remaining views are synthesized at the decoder side. The transmitted views are encoded using the versatile video coding (VVC) standard and are used as reference views to synthesize the dropped views. The selection of non-reference dropped views is performed using a rate-distortion optimization based on the VVC temporal scalability. The dropped views are reconstructed using the LF dual discriminator GAN (LF-D2GAN) model. In addition, to ensure that the quality of the views is consistent, at the decoder, a quality enhancement procedure is performed on the reconstructed views allowing smooth navigation across views. Experimental results show that the proposed method provides high coding performance and outperforms the state-of-the-art LF image compression methods by –36.22% in terms of BD-BR and 1.35 dB in BD-PSNR. The web page of this work is available at https://naderbakir79.github.io/LFD2GAN.html .

  • Research Article
  • Cite Count: 8
  • 10.1109/tip.2022.3223787
Advanced Scalability for Light Field Image Coding
  • Jan 1, 2022
  • IEEE Transactions on Image Processing
  • Hadi Amirpour + 3 more

Light field imaging, which captures both spatial and angular information, improves user immersion by enabling post-capture actions, such as refocusing and changing view perspective. However, light fields represent very large volumes of data with a lot of redundancy that coding methods try to remove. State-of-the-art coding methods indeed usually focus on improving compression efficiency and overlook other important features in light field compression such as scalability. In this paper, we propose a novel light field image compression method that enables (i) viewport scalability, (ii) quality scalability, (iii) spatial scalability, (iv) random access, and (v) uniform quality distribution among viewports, while keeping compression efficiency high. To this end, light fields in each spatial resolution are divided into sequential viewport layers, and viewports in each layer are encoded using the previously encoded viewports. In each viewport layer, the available viewports are used to synthesize intermediate viewports using a video interpolation deep learning network. The synthesized views are used as virtual reference images to enhance the quality of intermediate views. An image super-resolution method is applied to improve the quality of the lower spatial resolution layer. The super-resolved images are also used as virtual reference images to improve the quality of the higher spatial resolution layer. The proposed structure also improves the flexibility of light field streaming, provides random access to the viewports, and increases error resiliency. The experimental results demonstrate that the proposed method achieves a high compression efficiency and it can adapt to the display type, transmission channel, network condition, processing power, and user needs.

  • Conference Article
  • Cite Count: 4
  • 10.1109/euvip53989.2022.9922749
FuRA: Fully Random Access Light Field Image Compression
  • Sep 11, 2022
  • Hadi Amirpour + 2 more

Light fields are typically represented by multi-view images, and enable post-capture actions such as refocusing and perspective shift. To compress a light field image, its view images are typically converted into a pseudo video sequence (PVS) and the generated PVS is compressed using a video codec. However, when using the inter-coding tool of a video codec to exploit the redundancy among view images, the possibility to randomly access any view image is lost. On the other hand, when video codecs independently encode view images using the intra-coding tool, random access to view images is enabled, but at the expense of a significant drop in compression efficiency. To address this trade-off, we propose to use neural representations to represent 4D light fields. For each light field, a multi-layer perceptron (MLP) is trained to map the four light field dimensions to the color space, thus enabling random access even to pixels. To achieve higher compression efficiency, neural network compression techniques are deployed. The proposed method outperforms the compression efficiency of HEVC inter-coding, while providing random access to view images and even pixel values.
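The idea of mapping the four light field dimensions to a color can be sketched in plain Python as a forward pass through a small MLP. Everything concrete here is an assumption for illustration: the layer sizes, the ReLU and sigmoid activations, the normalized coordinates `(u, v, x, y)`, and the helper names `init_mlp` and `query_pixel`. The weights are random and untrained, so the output is not a meaningful color; the sketch only shows why any single pixel can be queried without decoding anything else.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Create untrained layers as (weights, biases) pairs; sizes are hypothetical."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def query_pixel(layers, u, v, x, y):
    """Evaluate the MLP at one 4D coordinate and return an RGB triple in (0, 1)."""
    h = [u, v, x, y]
    last = len(layers) - 1
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, h)) + b for row, b in zip(weights, biases)]
        if i < last:
            h = [max(0.0, t) for t in z]                    # ReLU on hidden layers
        else:
            h = [1.0 / (1.0 + math.exp(-t)) for t in z]     # sigmoid on the output
    return h


layers = init_mlp([4, 32, 32, 3])
# (u, v) select the view, (x, y) select the pixel; all assumed normalized to [0, 1].
rgb = query_pixel(layers, u=0.5, v=0.5, x=0.25, y=0.75)
print(len(rgb))
```

In a representation of this kind, the stored data is the set of network weights, so the cost of reading one pixel is a single forward pass regardless of which view it belongs to.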
