Abstract

DNNs are becoming increasingly deeper, wider, and more nonlinear due to the growing demands on prediction accuracy and analysis quality. When training a DNN model, the intermediate activation data must be saved in memory during forward propagation and then restored for backward propagation. Traditional memory-saving techniques such as data recomputation and migration either suffer from high performance overhead or are constrained by specific interconnect technology and limited bandwidth. In this paper, we propose a novel memory-driven high-performance CNN training framework that leverages error-bounded lossy compression to significantly reduce the memory requirement of training, allowing larger neural networks to be trained. Specifically, we provide a theoretical analysis, propose an improved lossy compressor, and design an adaptive scheme that dynamically configures the lossy compression error bound and adjusts the training batch size to further utilize the saved memory space for additional speedup. We evaluate our design against state-of-the-art solutions with four widely adopted CNNs and the ImageNet dataset. Results demonstrate that our proposed framework can reduce training memory consumption by up to 13.5× over baseline training and by up to 1.8× over the state-of-the-art compression-based framework, with little or no accuracy loss. The full paper is available at https://arxiv.org/abs/2011.09017.
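
As a rough illustration of the core idea (not the authors' implementation), the sketch below shows a PyTorch activation function that keeps its saved activation in a compressed form after forward propagation and decompresses it only when backward propagation needs it. Simple uniform quantization stands in for the paper's error-bounded lossy compressor, and the error_bound parameter is a hypothetical placeholder for the adaptively configured bound.

```python
import torch

class CompressedReLU(torch.autograd.Function):
    """ReLU whose saved activation is stored lossily compressed.

    Minimal sketch: uniform quantization is used as a stand-in codec for an
    error-bounded lossy compressor; values are assumed to fit in int16 range
    after scaling.
    """

    @staticmethod
    def forward(ctx, x, error_bound=1e-2):
        y = torch.relu(x)
        # "Compress": quantize the activation so each value deviates from the
        # original by at most error_bound, and store it in a compact dtype.
        scale = 2.0 * error_bound
        ctx.save_for_backward(torch.round(y / scale).to(torch.int16))
        ctx.scale = scale
        return y

    @staticmethod
    def backward(ctx, grad_output):
        (q,) = ctx.saved_tensors
        # "Decompress" the activation only when backward propagation needs it.
        y_approx = q.to(grad_output.dtype) * ctx.scale
        # ReLU gradient computed from the decompressed activation.
        grad_input = grad_output * (y_approx > 0).to(grad_output.dtype)
        return grad_input, None  # no gradient for error_bound


# Example usage (hypothetical shapes):
x = torch.randn(8, 64, 32, 32, requires_grad=True)
y = CompressedReLU.apply(x)
y.sum().backward()
```

In the paper's framework, the compression error bound is configured dynamically rather than fixed as above, and the memory freed by compressing activations is used to enlarge the training batch size for additional speedup.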
