SCANN: Synthesis of Compact and Accurate Neural Networks

Shayan Hassantabar, Zeyu Wang, Niraj K. Jha

doi:10.1109/tcad.2021.3116470

Abstract

Deep neural networks (DNNs) have become the driving force behind recent artificial intelligence (AI) research. With the help of a vast amount of training data, neural networks can perform better than traditional machine learning algorithms in many applications. An important problem with implementing a neural network is the design of its architecture. Typically, such an architecture is obtained manually by exploring its hyperparameter space and kept fixed during training. This approach is both time consuming and inefficient. Another issue is that modern neural networks often contain millions of parameters, whereas many applications require small inference models due to imposed resource constraints, such as energy constraints on battery-operated devices. However, efforts to migrate DNNs to such devices typically entail a significant loss of classification accuracy. To address these challenges, we propose a two-step neural network synthesis methodology, called DR+SCANN, that combines two complementary approaches to design compact and accurate DNNs. At the core of our framework is the SCANN methodology that uses three basic architecture-changing operations, namely, connection growth, neuron growth, and connection pruning, to synthesize feedforward architectures with arbitrary structure. These neural networks are not limited to the multilayer perceptron structure. SCANN encapsulates three synthesis methodologies that apply a repeated grow-and-prune paradigm to three architectural starting points. DR+SCANN combines the SCANN methodology with dataset dimensionality reduction to alleviate the curse of dimensionality. We demonstrate the efficacy of SCANN and DR+SCANN on various image and nonimage datasets. We evaluate SCANN on MNIST, CIFAR-10, and ImageNet benchmarks. 
Without any loss in accuracy, SCANN generates a $46.3\times$ smaller network than the LeNet-5 Caffe model. We also compare SCANN-synthesized networks with a state-of-the-art fully connected (FC) feedforward model for MNIST, and show $20\times$ ($19.9\times$) reduction in the number of parameters (floating-point operations) with little drop in accuracy. For the CIFAR-10 dataset, we target AlexNet and VGG-16 baseline architectures. SCANN reduces the number of parameters in AlexNet by $10.1\times$ without any drop in accuracy. It reduces the number of parameters in the FC layers of VGG-16 to only 2.5k while increasing accuracy by 1.05%. On the ImageNet dataset, for the VGG-16 and MobileNetV2 architectures, we reduce network parameters by $8.0\times$ and $1.3\times$, respectively, with a similar or improved performance over their respective baselines. We also evaluate the efficacy of using dimensionality reduction alongside SCANN (DR+SCANN) on nine small-to-medium-size datasets.
Using this methodology enables us to reduce the number of connections in the network by up to $5078.7\times$ (geometric mean: $82.1\times$), with little to no drop in accuracy. On seven out of nine datasets, we show 0.41%–10.09% accuracy improvements over the FC baseline models. We also show that our synthesis methodology yields neural networks that are much better at navigating the accuracy versus energy efficiency space. This can enable neural network-based inference even on Internet-of-Things sensors.
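
As a rough illustration of the grow-and-prune idea named in the abstract, the sketch below applies one pruning step and one growth step to a connection mask over a small weight matrix. The abstract names the operations but not their selection criteria, so the choices here are assumptions rather than the authors' method: pruning removes the active connections with the smallest weight magnitude, and growth activates the inactive connections with the largest gradient magnitude. Neuron growth is omitted, and all function names and numeric values are hypothetical.

```python
def prune_connections(weights, mask, fraction):
    """Deactivate the given fraction of active connections with smallest |weight|.

    Assumption: magnitude-based selection (not stated in the abstract).
    """
    active = [(abs(weights[i][j]), i, j)
              for i in range(len(mask))
              for j in range(len(mask[0]))
              if mask[i][j]]
    active.sort()  # smallest magnitudes first
    n_prune = int(len(active) * fraction)
    for _, i, j in active[:n_prune]:
        mask[i][j] = 0
        weights[i][j] = 0.0
    return n_prune


def grow_connections(grads, mask, n_grow):
    """Activate up to n_grow inactive connections with largest |gradient|.

    Assumption: gradient-based selection (not stated in the abstract).
    New connections keep weight 0.0 and would be trained afterwards.
    """
    inactive = [(abs(grads[i][j]), i, j)
                for i in range(len(mask))
                for j in range(len(mask[0]))
                if not mask[i][j]]
    inactive.sort(reverse=True)  # largest gradients first
    grown = inactive[:n_grow]
    for _, i, j in grown:
        mask[i][j] = 1
    return len(grown)


def count_connections(mask):
    """Number of active connections in the mask."""
    return sum(sum(row) for row in mask)


# Illustrative 4x4 layer: made-up weights and gradients, fully connected at start.
weights = [[0.9, -0.05, 0.3, 0.01],
           [-0.7, 0.02, 0.6, -0.04],
           [0.08, 0.5, -0.03, 0.4],
           [0.06, -0.8, 0.07, 0.2]]
grads = [[0.0, 0.9, 0.0, 0.1],
         [0.0, 0.2, 0.0, 0.3],
         [0.8, 0.0, 0.05, 0.0],
         [0.15, 0.0, 0.25, 0.0]]
mask = [[1] * 4 for _ in range(4)]

before = count_connections(mask)
pruned = prune_connections(weights, mask, fraction=0.5)
after_prune = count_connections(mask)
grown = grow_connections(grads, mask, n_grow=2)
after_grow = count_connections(mask)

print(f"connections: {before} -> {after_prune} (pruned) -> {after_grow} (grown)")
print(f"compression vs. dense: {before / after_grow:.1f}x")
```

In a full synthesis loop, steps like these would alternate with training, so that newly grown connections acquire useful weights before the next pruning pass.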

Full Text