Deep neural networks (DNNs) have demonstrated remarkable performance in many fields, and deploying them on resource-limited devices has drawn more and more attention in industry and academia. Typically, there are great challenges for intelligent networked vehicles and drones to deploy object detection tasks due to the limited memory and computing power of embedded devices. To meet these challenges, hardware-friendly model compression approaches are required to reduce model parameters and computation. Three-stage global channel pruning, which involves sparsity training, channel pruning, and fine-tuning, is very popular in the field of model compression for its hardware-friendly structural pruning and ease of implementation. However, existing methods suffer from problems such as uneven sparsity, damage to the network structure, and reduced pruning ratio due to channel protection. To solve these issues, the present article makes the following significant contributions. First, we present an element-level heatmap-guided sparsity training method to achieve even sparsity, resulting in higher pruning ratio and improved performance. Second, we propose a global channel pruning method that fuses both global and local channel importance metrics to identify unimportant channels for pruning. Third, we present a channel replacement policy (CRP) to protect layers, ensuring that the pruning ratio can be guaranteed even under high pruning rate conditions. Evaluations show that our proposed method significantly outperforms the state-of-the-art (SOTA) methods in terms of pruning efficiency, making it more suitable for deployment on resource-limited devices.
Read full abstract