Abstract

In the past decade deep neural networks (DNNs) have shown state-of-the-art performance on a wide range of complex machine learning tasks. Many of these results have been achieved while growing the size of DNNs, creating a demand for their efficient compression and transmission. In this work we present DeepCABAC, a universal compression algorithm for DNNs that is based on applying the Context-based Adaptive Binary Arithmetic Coder (CABAC) to the DNN parameters. CABAC was originally designed for the H.264/AVC video coding standard and became the state of the art for the lossless entropy coding stage of video compression. DeepCABAC applies a novel quantization scheme that minimizes a rate-distortion function while simultaneously taking the impact of quantization on the DNN's performance into account. Experimental results show that DeepCABAC consistently attains higher compression rates than previously proposed coding techniques for DNN compression. For instance, it is able to compress the VGG16 ImageNet model by a factor of 63.6× with no loss of accuracy, thus representing the entire network with merely 9 MB. The source code for encoding and decoding can be found at https://github.com/fraunhoferhhi/DeepCABAC.
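To illustrate the rate-distortion idea the abstract refers to, the following is a minimal, hypothetical sketch of assigning each weight to the quantization point that minimizes distortion plus λ times an estimated bit cost. The function name `rd_quantize`, the uniform step size, and the simple logarithmic rate proxy are illustrative assumptions; the actual DeepCABAC quantizer derives its rate estimates from CABAC's context-adaptive probability models.

```python
import numpy as np

def rd_quantize(weights, step=0.05, lam=1e-4):
    """Assign each weight to the candidate quantization point that
    minimizes the cost  distortion + lam * rate  (rate-distortion
    quantization; hypothetical sketch, not the DeepCABAC quantizer)."""
    # Candidate quantization points: integer multiples of the step size,
    # covering the full range of the weights.
    k_max = int(np.ceil(np.abs(weights).max() / step))
    indices = np.arange(-k_max, k_max + 1)
    points = indices * step
    # Crude rate proxy: larger quantization indices cost more bits.
    # (DeepCABAC instead uses CABAC's context-adaptive bit estimates.)
    rate = np.log2(1.0 + np.abs(indices))
    # Per-weight, per-candidate rate-distortion cost.
    w = weights.reshape(-1, 1)
    cost = (w - points) ** 2 + lam * rate
    best = cost.argmin(axis=1)
    return points[best].reshape(weights.shape)

w = np.array([0.12, -0.031, 0.0005, 0.49])
print(rd_quantize(w))
```

Raising `lam` trades accuracy for rate: with a large `lam` every weight collapses to zero (the cheapest index), while `lam=0` reduces to plain nearest-point quantization.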

Highlights

  • It has been well established that deep neural networks excel at solving many complex machine learning tasks [1], [2]

  • In our experiments we show that DeepCABAC is able to attain very high compression ratios and that it consistently attains a higher compression performance than previously proposed coders

  • We have reviewed some fundamental results of source coding theory



Introduction

It has been well established that deep neural networks excel at solving many complex machine learning tasks [1], [2]. Their relatively recent success can be attributed to three phenomena: 1) access to large amounts of data, 2) novel optimization algorithms and model architectures that allow training very deep neural networks, and 3) the increasing availability of compute resources [1]. The latter two allowed machine learning practitioners to equip neural networks with an ever-growing number of layers and to consistently attain state-of-the-art results on a wide spectrum of complex machine learning tasks. However, the authors of [3] show that the memory-energy efficiency trends of most common hardware platforms cannot keep up with the exponential growth of neural networks' sizes, so these networks are expected to become increasingly power hungry over time.

