Abstract

Artificial neural networks achieve state-of-the-art performance in many branches of engineering and are increasingly deployed on mobile devices such as smartphones. Because of the limited hardware resources and channel capacity of such devices, compressing neural network models to reduce storage and transmission costs is desirable, as is reduced computational complexity. This work investigates incorporating the curvature of the loss surface into the training of artificial neural networks and analyzes its benefit for the compression of neural networks through quantization and pruning of their weights. As a proof of concept, three small LeNet-based neural networks were trained with a novel loss function consisting of a weighted average of the cross-entropy loss and the Frobenius norm of the Hessian matrix, so that both the loss and the local curvature are minimized concurrently. Using the proposed method, mean test accuracies on the MNIST and FashionMNIST datasets after quantization improved considerably: by up to about 47.6 % for 1-bit quantization on MNIST and about 27.8 % on FashionMNIST, compared to quantization after training without curvature information. Pruning was also found to benefit from introducing curvature into the training, with an increase of up to about 14.6 % in mean test accuracy compared to pruning after training without curvature information, except in isolated cases. Training the networks first without curvature information and then continuing training for only one epoch with curvature information increased the mean test accuracy after 1-bit quantization by about 16 %. The proposed method can potentially improve the accuracy after compression irrespective of the compression method applied.
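The abstract describes the training objective as a weighted average of the cross-entropy loss and the Frobenius norm of the Hessian. The sketch below illustrates one plausible way to realize this in PyTorch; it is an assumption, not the paper's implementation. Since the full Hessian is infeasible to materialize for anything but tiny networks, a Hutchinson-style estimate of the Frobenius norm via Hessian-vector products stands in here, and the function name, the weight `alpha`, and the probe count `n_probes` are all hypothetical.

```python
import torch
import torch.nn.functional as F

def curvature_regularized_loss(model, x, y, alpha=0.9, n_probes=1):
    """Hypothetical sketch of the paper's objective:
    alpha * cross_entropy + (1 - alpha) * ||H||_F,
    with ||H||_F estimated stochastically (assumption)."""
    params = [p for p in model.parameters() if p.requires_grad]
    ce = F.cross_entropy(model(x), y)
    # First-order gradients with a graph, so Hessian-vector
    # products can be taken and backpropagated through.
    grads = torch.autograd.grad(ce, params, create_graph=True)
    fro_sq = 0.0
    for _ in range(n_probes):
        # Rademacher probe vectors, one per parameter tensor.
        vs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        # H v via double backward (H is symmetric).
        hvp = torch.autograd.grad(
            grads, params, grad_outputs=vs,
            retain_graph=True, create_graph=True,
        )
        # E[||H v||^2] = ||H||_F^2 for Rademacher probes.
        fro_sq = fro_sq + sum((h ** 2).sum() for h in hvp)
    fro = torch.sqrt(fro_sq / n_probes)
    return alpha * ce + (1.0 - alpha) * fro
```

In a training loop this loss would replace the plain cross-entropy call; the double-backward pass roughly doubles the cost per step, which is consistent with the paper's finding that even a single epoch of curvature-aware fine-tuning after standard training helps.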
