Abstract

We review a class of methods that can be collected under the name nonlinear transform coding (NTC), which over the past few years have become competitive with the best linear transform codecs for images, and have superseded them in terms of rate–distortion performance under established perceptual quality metrics such as MS-SSIM. We assess the empirical rate–distortion performance of NTC with the help of simple example sources, for which the optimal performance of a vector quantizer is easier to estimate than with natural data sources. To this end, we introduce a novel variant of entropy-constrained vector quantization. We provide an analysis of various forms of stochastic optimization techniques for NTC models; review architectures of transforms based on artificial neural networks, as well as learned entropy models; and provide a direct comparison of a number of methods to parameterize the rate–distortion trade-off of nonlinear transforms, introducing a simplified one.
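
As background for the entropy-constrained vector quantization (ECVQ) mentioned above, the sketch below implements the classical ECVQ encoding rule: pick the index that minimizes the Lagrangian cost, here written as ideal code length plus λ times distortion so that it matches the R + λD form used for NTC later in the paper. This is a minimal illustration, not the paper's variational variant; the codebook, index probabilities, and λ value are assumptions made only for the example.

```python
import numpy as np

def ecvq_encode(x, codebook, probs, lam):
    """Classical ECVQ encoding rule: choose the index minimizing the
    Lagrangian cost, ideal code length (-log2 p_i) + lam * d(x, c_i)."""
    dist = np.sum((codebook - x) ** 2, axis=1)   # squared error d(x, c_i)
    length = -np.log2(probs)                     # ideal code length of index i
    return int(np.argmin(length + lam * dist))

# Toy setup (assumed for illustration, not from the paper):
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 2))               # 8 codewords in R^2
probs = np.full(8, 1 / 8)                        # index probabilities
x = np.array([0.3, -0.5])
print(ecvq_encode(x, codebook, probs, lam=10.0)) # larger lam favors low distortion
```

Sweeping λ traverses the rate–distortion trade-off: small λ lets the encoder favor cheap (high-probability) indices, while large λ pushes it toward the nearest codeword regardless of its code length.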

Highlights

  • There is no end in sight for the world’s reliance on multimedia communication

  • This paper reviews some of the recent developments in data-driven lossy compression; in particular, we focus on a class of methods that can be collectively called nonlinear transform coding (NTC), providing insights into its capabilities and challenges

  • In linear transform coding with a Gaussian source assumption, the probabilistic model P in Eq. (9) is typically taken to be a distribution factorized over the latent dimensions, since the Karhunen–Loève transform (KLT) decorrelates the source, which for a Gaussian implies a factorized latent density (see the sketch below)
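
To make this last highlight concrete, the following sketch draws samples from a toy two-dimensional Gaussian source (the covariance is chosen only for illustration), applies the KLT, i.e., projects the samples onto the eigenvectors of the source covariance, and verifies that the latent covariance is diagonal. For a Gaussian source, decorrelation implies independence, so a fully factorized model P is exact.

```python
import numpy as np

# Toy correlated Gaussian source (covariance chosen for illustration).
Sigma = np.array([[2.0, 1.2],
                  [1.2, 1.0]])
rng = np.random.default_rng(0)
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=100_000)

# KLT: project onto the eigenvectors of the source covariance.
eigvals, U = np.linalg.eigh(Sigma)
y = x @ U

# The latent covariance is (numerically) diagonal, so the Gaussian latent
# density factorizes over dimensions and a factorized model P is exact.
print(np.round(np.cov(y.T), 3))   # off-diagonals ~ 0; diagonal ~ eigvals
```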

Summary

INTRODUCTION

There is no end in sight for the world’s reliance on multimedia communication. Digital devices have been increasingly permeating our daily lives, and with them comes the need to store, send, and receive images and audio ever more efficiently. Transform coding (TC) has been the method of choice for compressing this type of data source. In combination with stochastic optimization methods, such as stochastic gradient descent (SGD), and massively parallel computational hardware, artificial neural networks (ANNs) have emerged as a nearly universal set of tools for function approximation, and they have been used in the context of data compression [4]–[9]. This paper reviews some of the recent developments in data-driven lossy compression; in particular, we focus on a class of methods that can be collectively called nonlinear transform coding (NTC), providing insights into its capabilities and challenges. The last two sections discuss connections to related work and conclude the paper, respectively.
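
To ground the idea, here is a minimal sketch of one stochastic optimization step for an NTC model: an analysis transform maps the source to a latent representation, quantization is replaced during training by an additive-uniform-noise proxy, and a stochastic gradient method descends the rate–distortion Lagrangian R + λD. The tiny MLP transforms, the factorized Gaussian entropy model, and the λ value are assumptions for illustration only, not the specific design discussed in the paper.

```python
import math
import torch
import torch.nn as nn

D, M = 16, 8                                   # source / latent dimensionality
g_a = nn.Sequential(nn.Linear(D, 32), nn.ReLU(), nn.Linear(32, M))  # analysis
g_s = nn.Sequential(nn.Linear(M, 32), nn.ReLU(), nn.Linear(32, D))  # synthesis
log_scale = nn.Parameter(torch.zeros(M))       # entropy-model scale per latent dim
opt = torch.optim.Adam([*g_a.parameters(), *g_s.parameters(), log_scale], lr=1e-3)

lam = 0.01                                     # rate-distortion trade-off weight
x = torch.randn(64, D)                         # stand-in batch of source vectors
y = g_a(x)                                     # analysis transform
y_tilde = y + torch.rand_like(y) - 0.5         # uniform-noise proxy for rounding

# Rate proxy in bits: negative log-density of the noisy latents under the
# entropy model (simplified; in practice the model's density is convolved
# with the uniform noise).
p = torch.distributions.Normal(0.0, log_scale.exp())
rate = -p.log_prob(y_tilde).sum(dim=1).mean() / math.log(2)

distortion = ((x - g_s(y_tilde)) ** 2).mean()  # MSE distortion
loss = rate + lam * distortion                 # rate-distortion Lagrangian
opt.zero_grad(); loss.backward(); opt.step()   # one stochastic gradient step
```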

STOCHASTIC RATE–DISTORTION OPTIMIZATION
  Variational entropy-constrained vector quantization
NONLINEAR TRANSFORM CODING
  Optimization and proxy rate–distortion loss
  The soft quantization
  Nonlinear transforms
  Local properties of nonlinear transforms
LEARNED ENTROPY MODELS
RD TRAVERSAL WITH λ-PARAMETERIZATION
RELATED WORK
CONCLUSION