As an effective tool for network compression, pruning techniques have been widely used to reduce the large number of parameters in deep neural networks (NNs). Nevertheless, unstructured pruning has the drawback of producing sparse and irregular weight tensors. By contrast, structured pruning avoids this drawback but requires complex criteria to determine which components to prune. Therefore, this paper presents a new method termed BUnit-Net, which directly constructs compact NNs by stacking designed basic units, without requiring any additional pruning criteria. Given basic units of various architectures, BUnit-Net combines and stacks them systematically to build compact NNs that contain fewer weight parameters thanks to the independence among the units. In this way, BUnit-Net can achieve the same compression effect as unstructured pruning while its weight tensors remain regular and dense. We instantiate BUnit-Net on diverse popular backbones and compare it with state-of-the-art pruning methods on different benchmark datasets. Moreover, two new metrics are proposed to evaluate the compression trade-off. Experimental results show that BUnit-Net achieves comparable classification accuracy while saving around 80% of FLOPs and 73% of parameters. Thus, stacking basic units offers a promising new direction for network compression.
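To make the idea of independent stacked units concrete, the following is a minimal sketch, not the paper's actual architecture: a hypothetical BasicUnitLayer that applies several small, mutually independent convolutional units to disjoint channel groups. All names and the grouped-convolution realization are assumptions for illustration; the point is that independence among units reduces the parameter count while each unit's weight tensor stays dense and regular.

```python
import torch
import torch.nn as nn

class BasicUnitLayer(nn.Module):
    """Hypothetical sketch: several small, independent "basic units"
    acting on disjoint channel groups. Because the units share no
    weights across groups, the layer holds fewer parameters than one
    dense convolution over all channels, yet every unit's weight
    tensor remains dense and regular (no sparse masks needed)."""

    def __init__(self, in_channels, out_channels, num_units, kernel_size=3):
        super().__init__()
        assert in_channels % num_units == 0 and out_channels % num_units == 0
        self.units = nn.ModuleList([
            nn.Conv2d(in_channels // num_units,
                      out_channels // num_units,
                      kernel_size, padding=kernel_size // 2)
            for _ in range(num_units)
        ])

    def forward(self, x):
        # Split channels evenly, apply each independent unit, re-concatenate.
        chunks = torch.chunk(x, len(self.units), dim=1)
        return torch.cat([u(c) for u, c in zip(self.units, chunks)], dim=1)

# Parameter count illustration: a dense Conv2d(64, 64, 3) has
# 64 * 64 * 9 = 36,864 weights; with 4 independent units it drops to
# 4 * (16 * 16 * 9) = 9,216, i.e. a 75% reduction.
layer = BasicUnitLayer(64, 64, num_units=4)
out = layer(torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```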