Abstract

At present, one-stage detectors based on lightweight models can run at real-time speed, but their detection accuracy remains limited. To make the features extracted by the model more discriminative and robust, and to improve detection performance on small objects, we propose two modules in this work. First, we propose a receptive field enhancement method, referred to as adaptive receptive field fusion (ARFF). It enhances the model’s feature representation ability by adaptively learning the fusion weights of the different receptive field branches in the receptive field module. Second, we propose an enhanced up-sampling (EU) module to reduce the information loss caused by up-sampling the feature map. Finally, we assemble the ARFF and EU modules on top of YOLO v3 to build a real-time, high-precision and lightweight object detection system referred to as the ARFF-EU network. We achieve a state-of-the-art speed and accuracy trade-off on both the Pascal VOC and MS COCO data sets, reporting 83.6% AP at 37.5 FPS and 42.5% AP at 33.7 FPS, respectively. The experimental results show that the proposed ARFF and EU modules improve the detection performance of the ARFF-EU network and allow it to reach the accuracy level of advanced, very deep detectors while maintaining real-time speed.
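The idea of fusing receptive field branches with learned weights can be illustrated with a minimal sketch. The abstract does not specify how the fusion weights are parameterized or normalized, so everything below is an assumption made for illustration: one scalar logit per branch, softmax normalization, and flat lists standing in for feature maps. The names `softmax` and `fuse_branches` are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Turn raw per-branch scores into positive weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(branches, logits):
    """Weighted sum of same-length branch outputs.

    branches: list of flat feature vectors, one per receptive field branch.
    logits:   one score per branch (assumed here to be a learned parameter;
              in training it would be updated by gradient descent).
    """
    if len(branches) != len(logits):
        raise ValueError("need exactly one logit per branch")
    length = len(branches[0])
    if any(len(b) != length for b in branches):
        raise ValueError("all branches must have the same length")
    weights = softmax(logits)
    return [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(length)
    ]


# Three hypothetical branches (e.g. different dilation rates), two values each.
branches = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fused = fuse_branches(branches, [0.0, 0.0, 0.0])  # equal logits: plain average
```

With equal logits the fusion reduces to a plain average of the branches; as one logit grows relative to the others, the output approaches that branch alone. A fixed-weight sum or a concatenation cannot shift emphasis between branches in this way.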

Highlights

  • Object detection is the most fundamental task in the computer vision community and has attracted researchers’ attention in different fields

  • We propose the enhanced up-sampling (EU) module to reduce the information loss caused by up-sampling in the lateral connection and enhance the representation ability of the feature pyramid

  • The EU module also enhances the semantic information of the shallow feature maps in the feature pyramid, which improves small object detection performance


Summary

Introduction

Object detection is one of the most fundamental tasks in computer vision and has attracted attention from researchers in many fields. It is widely used in video surveillance [1] and self-driving [2] and forms a key component of many other visual tasks, such as scene understanding [3,4] and image guidance [5]. Some studies [8,9,10] have enhanced the model's ability to perform feature extraction, selection and fusion. Real-time, high-precision detection of objects at different scales nevertheless remains a challenge. At present, advanced detectors improve detection performance by constructing deep feature pyramids and spatial receptive field modules.
