Abstract

To address the inability of embedded platforms to detect targets in multisource images in real time, this paper proposes MNYOLO (MobileNet-YOLOv4-tiny), a lightweight target detection network for embedded platforms that replaces standard convolution with depthwise separable convolution to reduce the number of model parameters and computations. The visible-light target detection model then serves as the pretrained model for the infrared target detection model, which is obtained by fine-tuning on an infrared target data set collected in the field. On this basis, a decision-level fusion detection model is built so that the infrared and visible-light bands contribute complementary information. Experimental results show that the decision-level fusion model has a clear advantage in detection accuracy over the single-band target detection models while still meeting real-time requirements, which verifies the effectiveness of the proposed algorithm.

Highlights

  • Target detection [1] is an important research topic in the field of computer vision

  • With the rapid development of deep learning, new detection algorithms for visible-light imagery continue to emerge

  • These algorithms fall mainly into two-stage [2, 3] and one-stage [4] detection models. Two-stage detection models, chiefly the R-CNN series, greatly improve detection accuracy by generating region proposals; one-stage detection models, chiefly the SSD [5, 6] and YOLO [7] series, adopt a one-step framework of global regression and classification that sacrifices some accuracy for a large gain in detection speed. Both kinds of model are based on preset anchor points


Summary

Introduction

Target detection [1] is an important research topic in the field of computer vision. Some visible-light target detection models achieve high accuracy, but the limited computing power and memory of embedded platforms make these algorithms difficult to migrate, so they cannot meet the industry's requirements for real-time, portable target detection. In response to this problem, Qi and Lu proposed the MobileNet model, which uses depthwise separable convolution to reduce the number of network weight parameters, and in 2019 they proposed a lightweight attention model that integrates the MobileNetV1 [11] depthwise separable convolution, the MobileNetV2 [12, 13] linear bottleneck with inverted residual structure, and MobileNetV3 [14, 15] Magnet, greatly reducing model parameters while further optimizing the target feature extraction and classification network.
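To make the parameter saving concrete, the following minimal sketch counts the weights of a standard convolution and of its depthwise separable replacement. The function names and the layer sizes (3 x 3 kernel, 128 input channels, 256 output channels) are illustrative assumptions and are not taken from the paper; biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed, not from the paper).
k, c_in, c_out = 3, 128, 256
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
ratio = sep / std  # algebraically equal to 1/c_out + 1/k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For these sizes the separable layer needs 33,920 weights against 294,912 for the standard layer, roughly 11.5% of the original. The ratio 1/c_out + 1/k² shows that with a 3 x 3 kernel the saving approaches a factor of nine as the number of output channels grows.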

Figures: loss curves on the visible-light training data; loss curves on the infrared training data; "Region avg IoU" panel; flowchart step "Select the target border with the highest confidence".
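The abstract describes a decision-level fusion of the infrared and visible-light detectors, and the flowchart label "Select the target border with the highest confidence" hints at the selection rule. The sketch below is one possible reading, not the authors' implementation: it assumes the two images are registered to the same pixel coordinates, that each detection is a hypothetical `(box, label, confidence)` tuple with `box = (x1, y1, x2, y2)`, and that two detections refer to the same target when they share a label and their IoU reaches an assumed threshold of 0.5.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(visible, infrared, iou_threshold=0.5):
    """Assumed fusion rule: for overlapping same-label detections keep the
    more confident one; keep unmatched detections from either band."""
    fused = []
    used_ir = set()
    for v_det in visible:
        v_box, v_label, v_conf = v_det
        best_j, best_iou = None, iou_threshold
        for j, (i_box, i_label, _) in enumerate(infrared):
            if j in used_ir or i_label != v_label:
                continue
            overlap = iou(v_box, i_box)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is None:
            fused.append(v_det)  # seen only in the visible band
        else:
            used_ir.add(best_j)
            i_det = infrared[best_j]
            fused.append(i_det if i_det[2] > v_conf else v_det)
    # Targets seen only in the infrared band are kept as well.
    fused.extend(d for j, d in enumerate(infrared) if j not in used_ir)
    return fused


# Made-up detections for illustration only.
visible = [((10, 10, 50, 50), "person", 0.6), ((100, 100, 140, 140), "car", 0.9)]
infrared = [((12, 12, 52, 52), "person", 0.8), ((200, 200, 230, 230), "person", 0.7)]
fused = fuse_detections(visible, infrared)
```

In this example the two overlapping "person" boxes collapse to the infrared one (confidence 0.8), while the car seen only in visible light and the person seen only in infrared both survive, which is the complementary behaviour the abstract attributes to fusion.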