Abstract

With the development of deep learning, researchers design deep network structures to extract rich high-level semantic information. Most popular algorithms are designed around the complexity of visible-image features. However, compared with visible images, infrared images have more homogeneous features, and applying deep networks to them is prone to extracting redundant features, so the network layers where redundant features are extracted should be pruned. This paper therefore proposes a pruning method for deep convolutional networks based on heat map generation metrics. A ‘network layer performance evaluation metric’ is computed from the number of activated pixels in each layer’s heat map, and the network layer with the lowest score is pruned. To address the problem that deleting multiple structures simultaneously may result in incorrect pruning, an alternating training and self-pruning strategy is proposed: a cyclic process that prunes the model once per cycle and retrains the pruned model, reducing the incorrect pruning of network layers. The experimental results show that the method proposed in this paper improves the performance of CSPDarknet, Darknet and ResNet.
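The alternating training and self-pruning strategy described above can be sketched as a simple loop: each cycle scores every remaining layer, removes only the single lowest-scoring one, and retrains before the next pruning decision. This is a minimal sketch, assuming per-layer scores (e.g. NPE) are supplied by a caller-provided `score_fn`; the names `alternating_prune` and `retrain_fn` are illustrative, not from the paper.

```python
def alternating_prune(layers, score_fn, retrain_fn, n_cycles):
    """Prune one layer per cycle and retrain in between, so that each
    pruning decision is made against an up-to-date model (a hypothetical
    sketch of the paper's alternating training and self-pruning)."""
    for _ in range(n_cycles):
        if len(layers) <= 1:                  # never prune the whole network
            break
        scores = [score_fn(layer) for layer in layers]
        worst = scores.index(min(scores))     # lowest score -> prune
        layers = layers[:worst] + layers[worst + 1:]
        retrain_fn(layers)                    # retrain before the next cycle
    return layers
```

Pruning only one layer per cycle, rather than several at once, is what avoids the "simultaneous deletion of multiple structures" failure mode the abstract describes.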

Highlights

  • With the widespread use of computer vision, deep learning networks based on multimodal data (visible images (RGB images), depth images, infrared images) are applied to various fields, including object detection [1–5] and classification [6,7], image segmentation [8–10], target tracking [11–13], etc.

  • In this paper, we propose formulas that quantitatively describe the feature extraction performance of each network layer, including the ‘foreground feature extraction capability metric’ (F), the ‘background feature suppression capability metric’ (B), and the ‘network layer performance evaluation metric’ (NPE)

  • This section details the experimental setup, including the dataset, experimental environment and experimental evaluation metrics
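The three metrics named in the highlights can be illustrated with one plausible formulation based on pixel-activation counts in a layer's heat map. The definitions below (F as the activated share of foreground pixels, B as the inactive share of background pixels, and NPE as their product) are assumptions for illustration only; the paper defines the exact formulas.

```python
import numpy as np

def layer_metrics(heatmap, fg_mask, threshold=0.5):
    """Illustrative per-layer scores from heat-map pixel activations.

    F   - foreground feature extraction capability: fraction of
          foreground pixels activated above `threshold`.
    B   - background feature suppression capability: fraction of
          background pixels left below `threshold`.
    NPE - network layer performance evaluation: combined here as F * B.
    (Assumed forms; the paper's exact formulas may differ.)
    """
    active = heatmap > threshold
    F = float(active[fg_mask].mean())
    B = float(1.0 - active[~fg_mask].mean())
    return F, B, F * B
```

With scores like these computed per layer, the layer with the lowest NPE is the pruning candidate.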


Summary

Introduction

With the widespread use of computer vision, deep learning networks based on multimodal data (visible images (RGB images), depth images, infrared images) are applied to various fields, including object detection [1–5] and classification [6,7], image segmentation [8–10], target tracking [11–13], etc. Most mainstream algorithms, however, are designed around RGB images and perform best when extracting key features from them. In general, the deeper a deep learning model’s network, the richer the image features it can extract; but for images such as infrared images, which contain only simple features such as edge contours, applying a deeper network to extract features produces over-fitting. The network layers where redundant features are extracted need to be pruned, which improves the network’s ability to extract key features from images and reduces the loss of important features. A second class of pruning methods relies on subjective assessment of good or bad network structures through visualized images [25–27].
