Abstract

Efficient deep learning requires a memory-efficient construction of the neural network. This paper introduces a layerwise tensorized formulation of a multilayer neural network, called LTNN, in which the weight matrix can be significantly compressed during training. By reshaping the weight matrix of each layer into a high-dimensional tensor with a low-rank approximation, significant network compression is achieved while accuracy is maintained. A corresponding layerwise training method is developed using a modified alternating least-squares method, with backward propagation used only for fine-tuning. LTNN achieves state-of-the-art results on various benchmarks with significant compression. On the MNIST benchmark, LTNN shows a 64× compression rate without an accuracy drop. On the ImageNet12 benchmark, the proposed LTNN achieves 35.84× compression of the neural network with an accuracy drop of around 2%. We also show 1.615× faster inference than existing works, owing to the smaller tensor core ranks.
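To make the reshaping idea concrete, the following is a minimal sketch of how a weight matrix might be reshaped into a high-dimensional tensor and stored as low-rank tensor-train cores. It uses the standard TT-SVD procedure (sequential truncated SVDs), not the modified alternating least-squares training described in the paper. The 256 × 256 layer size, the (16, 16, 16, 16) tensor shape, the rank of 4, and the names `tt_svd` and `tt_to_full` are all assumptions chosen for illustration; the paper's actual mode ordering and ranks may differ.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Split a d-way tensor into tensor-train cores via sequential truncated SVDs."""
    dims = tensor.shape
    cores = []
    rank_prev = 1
    mat = tensor.reshape(rank_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        rank = min(max_rank, len(s))  # truncate to the allowed core rank
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        # Carry the remaining factor forward and fold in the next mode.
        mat = (s[:rank, None] * vt[:rank]).reshape(rank * dims[k + 1], -1)
        rank_prev = rank
    cores.append(mat.reshape(rank_prev, dims[-1], 1))
    return cores


def tt_to_full(cores):
    """Contract the cores back into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape([c.shape[1] for c in cores])


# Hypothetical layer: a 256 x 256 weight matrix that is exactly low-rank in
# tensor-train form (built from random rank-4 cores so the demo is exact).
rng = np.random.default_rng(0)
shape, true_rank = (16, 16, 16, 16), 4
ranks = [1, true_rank, true_rank, true_rank, 1]
seed_cores = [rng.standard_normal((ranks[i], shape[i], ranks[i + 1]))
              for i in range(len(shape))]
weights = tt_to_full(seed_cores).reshape(256, 256)

# Reshape the matrix into a 4-way tensor and compress it.
cores = tt_svd(weights.reshape(shape), max_rank=true_rank)
approx = tt_to_full(cores).reshape(256, 256)

n_core_params = sum(core.size for core in cores)
ratio = weights.size / n_core_params
rel_err = np.linalg.norm(weights - approx) / np.linalg.norm(weights)
print(f"parameters: {weights.size} -> {n_core_params}, compression {ratio:.1f}x")
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this sketch the storage cost depends on the core ranks rather than on the full matrix size, which is why smaller ranks give both higher compression and fewer operations at inference time. A trained weight matrix is generally not exactly low-rank, so truncation would introduce some error in practice; the synthetic matrix here is constructed so that the reconstruction is exact up to floating-point precision.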

