Abstract

Target detection is one of the most important research directions in computer vision, and a variety of target detection algorithms have been proposed recently. Because targets in a scene vary in size, a detector must be able to detect them at different scales. To improve detection performance for targets of different sizes, we propose a multi-scale target detection algorithm based on an improved YOLO (You Only Look Once) V3. The main contributions of our work are: (1) a mathematical derivation method based on Intersection over Union (IOU) is proposed to select the number and the aspect ratios of the candidate anchor boxes for each scale of the improved YOLO V3; (2) to further improve the detection performance of the network, the detection scales of YOLO V3 are extended from 3 to 4, and a feature fusion target detection layer downsampled by 4× is established to detect small targets; (3) to avoid vanishing gradients and enhance feature reuse, the six convolutional layers in front of the output detection layer are transformed into two residual units. Experimental results on the PASCAL VOC and KITTI datasets show that the proposed method achieves better performance than other state-of-the-art target detection algorithms.

Highlights

  • Target detection is one of the research hotspots in the field of computer vision

  • A multi-scale target detection approach based on YOLO V3 is proposed

  • Compared with YOLO V3, our proposed network has improved mAP by 7.47% on the PASCAL VOC dataset and by 1.77% on the KITTI dataset

Summary

Introduction

Target detection is one of the research hotspots in the field of computer vision. It determines both the location and the category of the targets in an image. The main contributions of our work can be summarized as follows: (1) a mathematical derivation method based on Intersection over Union (IOU) is proposed to select the number and the aspect ratios of the candidate anchor boxes for each scale of the improved YOLO V3; (2) to further improve the detection performance of the network, the detection scales of YOLO V3 are extended from 3 to 4, and a feature fusion target detection layer downsampled by 4× is established to detect small targets; (3) to avoid vanishing gradients and enhance feature reuse, the six convolutional layers in front of the output detection layer are transformed into two residual units; (4) we compare our approach with state-of-the-art target detection algorithms on both the PASCAL VOC dataset and the KITTI dataset to evaluate the performance of the improved network.
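Two of the quantities mentioned above can be illustrated with a minimal Python sketch. It is not the authors' derivation method, which this summary does not reproduce. The first function computes the IOU between a ground-truth box and a candidate anchor box using only their widths and heights, a comparison commonly used when choosing anchor shapes. The second lists the detection grid sizes for four scales, assuming the standard YOLO V3 strides of 32, 16 and 8 plus the added 4× downsampled layer. The function names `iou_wh` and `grid_sizes` and the 416-pixel input size are assumptions made for illustration.

```python
def iou_wh(box, anchor):
    """IOU of two boxes given as (width, height), aligned at a shared corner.

    Position is ignored, so only the size and aspect ratio are compared.
    """
    w1, h1 = box
    w2, h2 = anchor
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union


def grid_sizes(input_size, strides=(32, 16, 8, 4)):
    """Side length of the detection grid at each scale.

    The strides 32, 16 and 8 are the standard YOLO V3 scales; stride 4
    corresponds to the extra layer downsampled by 4x. The input size is
    an assumed value supplied by the caller.
    """
    return [input_size // s for s in strides]


if __name__ == "__main__":
    # A 2x4 box and a 4x2 anchor overlap in a 2x2 region.
    print(iou_wh((2, 4), (4, 2)))
    # Grid sizes for an assumed 416x416 input.
    print(grid_sizes(416))
```

With a 416-pixel input, the 4× downsampled layer yields a 104 × 104 grid, so each cell covers a 4 × 4 pixel region. That finer grid is what makes the fourth scale suited to small targets.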

Brief Introduction to YOLO V3
Our Approach
The network of improved YOLO
Experiments and Results
Experiment on PASCAL
Experiment on KITTI Dataset
Quantitative and Qualitative Evaluation
Figures A
Conclusions
References