Abstract

Research in light field (LF) processing has increased heavily over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
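To make the kernel idea concrete, the following is a minimal, hypothetical 1-D sketch of how a Mixture-of-Experts with Gaussian gates blends simple linear experts into one continuous function, in the spirit of the SMoE model described above. The parameter values, function name, and the restriction to 1-D with zero-slope experts are illustrative assumptions, not the paper's actual formulation or numbers.

```python
import numpy as np

def smoe_reconstruct(x, centers, bandwidths, slopes, offsets):
    """Evaluate a toy 1-D SMoE model at positions x (shape [N]).

    Each kernel contributes a Gaussian gate and a linear expert;
    the output is the softly gated blend of all experts.
    """
    d = x[:, None] - centers[None, :]                 # [N, K] distances
    g = np.exp(-0.5 * (d / bandwidths[None, :]) ** 2) # unnormalized gates
    w = g / g.sum(axis=1, keepdims=True)              # soft gating weights
    experts = slopes[None, :] * d + offsets[None, :]  # linear expert per kernel
    return (w * experts).sum(axis=1)                  # continuous reconstruction

# Two kernels approximating a smooth step between two intensity levels.
centers    = np.array([0.25, 0.75])
bandwidths = np.array([0.10, 0.10])
slopes     = np.array([0.0, 0.0])   # constant experts for simplicity
offsets    = np.array([0.2, 0.9])
x = np.linspace(0.0, 1.0, 5)
y = smoe_reconstruct(x, centers, bandwidths, slopes, offsets)
```

In the actual framework the same construction extends to 4-D and 5-D coordinates, where each kernel is a steered higher-dimensional Gaussian spanning many pixels across all dimensions at once.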

Highlights

  • Virtual reality (VR) for camera-captured scenes is fundamentally different and more complex compared to VR consumption of computer-generated (CG) scenes.

  • A light field (LF) video with 10 × 10 viewpoints, in full HD at 30 fps, yields 6,220,800,000 pixels per second. Sparse representations such as Steered Mixture-of-Experts (SMoE) are hugely beneficial for such higher-dimensional modalities, as a single kernel can span a large number of pixels spread out over five dimensions simultaneously.

  • Four RD-points of the three High Efficiency Video Coding (HEVC) configurations and of the minibatch SMoE method were selected in the lowest range, as this was assumed to cover the highest variance in Mean Opinion Scores (MOS).
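The pixel-rate figure in the highlight above follows directly from the stated setup (10 × 10 viewpoints, full-HD 1920 × 1080 frames, 30 fps); a quick sanity check:

```python
# Verify the highlight's pixel rate: 10 x 10 viewpoints,
# full-HD (1920 x 1080) frames, at 30 frames per second.
views = 10 * 10
pixels_per_frame = 1920 * 1080
fps = 30
rate = views * pixels_per_frame * fps
print(rate)  # 6220800000 pixels per second
```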


Summary

INTRODUCTION

Virtual reality (VR) for camera-captured scenes is fundamentally different and more complex compared to VR consumption of computer-generated (CG) scenes (e.g., as in gaming). One common strategy relies on known hybrid transform/difference-coding techniques commonly used for video: following this philosophy, scenes are represented by coding a minimal set of 2-D images and reconstructing the missing ones by view synthesis. The reconstruction is not truly pixel-level parallel due to the intra-coding techniques, and these systems do not cope well with irregularly sampled data and heterogeneous camera setups in scene acquisition systems. We focus on LF images and video, and on the reconstruction performance with coded model parameters as an LF compression tool. Other applications of this representation include super-resolution, denoising, and segmentation.

RELATED WORK
Introduction
Mixture-of-Experts Based on GMMs
Robust Modeling of GMMs for SMoE
Example
PROPOSED CODING SCHEME
Local Modeling
APPROXIMATION AND CODING EXPERIMENTS
Light Field Image Coding
Light Field Video Coding
CONCLUSION AND FUTURE WORK

