Abstract

To address the shortcomings of the current YOLOv3 model, such as its large size, slow response, and difficulty of deployment on real devices, this paper restructures the target detection model YOLOv3 and proposes a new lightweight target detection network, YOLOv3-promote. First, the G-Module, combined with Depth-Wise convolution, is used to construct the backbone network of the model, and an attention mechanism is introduced to weight each channel so that more key features are retained and redundant features are removed, strengthening the network's ability to distinguish target objects from the background. Second, the magnitude of the scaling factor gamma in the batch normalization layers is used to identify and delete less important channels, which compresses the model and speeds up computation. Finally, model conversion and half-precision acceleration are carried out with NVIDIA's TensorRT framework, and the accelerated model is successfully deployed on the Jetson Nano embedded platform. Experiments on the KITTI dataset show that the inference speed of the proposed method is about 5 times that of the original model, the parameter count is reduced to one tenth, the mAP increases from 86.1% for the original model to 93.1%, and the speed reaches 25.5 FPS, meeting the requirements of real-time, high-precision detection.
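The pruning step can be illustrated with a minimal sketch. It assumes a single global threshold on |gamma| across all batch normalization layers, which is a common convention for this style of channel pruning and is not confirmed by the abstract. The function name `select_channels` and the `prune_ratio` parameter are hypothetical.

```python
def select_channels(gammas_per_layer, prune_ratio):
    """Return, for each layer, the indices of the channels to keep.

    gammas_per_layer: list of lists, one list of BN scaling factors per layer.
    prune_ratio: fraction of all channels to remove (assumed global, 0 <= r < 1).
    """
    # Pool the magnitudes of every gamma and sort them in ascending order.
    all_mags = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    n_prune = int(len(all_mags) * prune_ratio)
    # Channels whose |gamma| is at or below this threshold are pruned.
    threshold = all_mags[n_prune - 1] if n_prune > 0 else float("-inf")

    keep = []
    for layer in gammas_per_layer:
        idx = [i for i, g in enumerate(layer) if abs(g) > threshold]
        if not idx:
            # Safeguard (an assumption): never remove a whole layer,
            # keep its single strongest channel instead.
            idx = [max(range(len(layer)), key=lambda i: abs(layer[i]))]
        keep.append(idx)
    return keep


if __name__ == "__main__":
    gammas = [[0.9, 0.01, 0.5], [0.02, 0.7, 0.03]]
    print(select_channels(gammas, prune_ratio=0.5))
```

Because the comparison is strict, channels tied with the threshold are also removed, so the pruned fraction can slightly exceed `prune_ratio` when several gammas share the same magnitude.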

Highlights

  • Nowadays, the Internet of Vehicles [33], [34] is becoming more and more popular around the world

  • We compare the Squeeze-and-Excitation Networks (SENet) attention mechanism [25], the Convolutional Block Attention Module (CBAM) [26], and our proposed attention mechanism [27] in terms of parameter count and accuracy when applied to the ResNet50, ResNet101, and ResNet152 backbone networks

  • We carried out model reconstruction, model pruning and half-precision acceleration on the classic YOLOv3 model in target detection

Summary

INTRODUCTION

Nowadays, the Internet of Vehicles [33], [34] is becoming more and more popular around the world. The one-stage method is a kind of regression-based target detection that aims to reconcile real-time performance with accuracy. With the rise of various neural network chips and high-memory graphics cards, one approach is to increase the hardware's computing power to speed up the network model. Another is to focus on the software. It is because redundant features exist that the network has a comprehensive understanding of the input data. To keep these redundant features, we propose a new model that generates more feature maps with only a small amount of computation, using G-Modules. The most important part of the proposed model is the Depth-Wise convolution.
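To give a sense of why Depth-Wise convolution reduces computation, the sketch below compares a standard convolution with a depth-wise convolution followed by a 1×1 point-wise convolution. This pairing is the common depth-wise separable construction and is an assumption here; the exact layout of the G-Module is not given in this summary. Bias terms are ignored, and the function names and layer sizes are hypothetical.

```python
def standard_conv_cost(c_in, c_out, k, h_out, w_out):
    """Parameters and multiply-accumulates of a standard k x k convolution."""
    params = c_in * c_out * k * k
    macs = params * h_out * w_out  # every weight is used once per output position
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h_out, w_out):
    """Cost of a k x k depth-wise conv followed by a 1 x 1 point-wise conv."""
    depthwise_params = c_in * k * k   # one k x k filter per input channel
    pointwise_params = c_in * c_out   # 1 x 1 conv that mixes channels
    params = depthwise_params + pointwise_params
    macs = params * h_out * w_out
    return params, macs


if __name__ == "__main__":
    # Hypothetical layer: 64 -> 128 channels, 3 x 3 kernel, 52 x 52 output map.
    std_params, std_macs = standard_conv_cost(64, 128, 3, 52, 52)
    dws_params, dws_macs = depthwise_separable_cost(64, 128, 3, 52, 52)
    print("standard:", std_params, "params,", std_macs, "MACs")
    print("depth-wise separable:", dws_params, "params,", dws_macs, "MACs")
    print("reduction factor:", round(std_params / dws_params, 2))
```

For this hypothetical layer the standard convolution has 73,728 weights against 8,768 for the separable form, a reduction of roughly 8.4 times in both parameters and multiply-accumulates.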

Depth-Wise Convolution
G-Module
FLOPs
G-Bottleneck
Attention Mechanism
Overall Structure of the Model
MODEL PRUNE
Data Set Description
Overall Process
Result of Detection
Findings
CONCLUSION
