Abstract

Object detection is a central problem in visual surveillance. The faster region-based convolutional neural network (Faster R-CNN) is a representative deep-learning object detection algorithm; however, both its generalization ability and its detection accuracy on small objects are limited. In this paper, an effective object detection algorithm is proposed for small and occluded objects, based on multi-layer convolution feature fusion (MCFF) and online hard example mining (OHEM). First, candidate regions are generated with a region proposal network optimized by MCFF. Then, an OHEM algorithm is employed to train the region-based ConvNet detector: hard examples are selected automatically to improve training efficiency, and avoiding invalid examples accelerates the convergence of model training. The experiments are performed on the KITTI data set in an intelligent traffic scenario. The proposed method outperforms popular methods such as Faster R-CNN and Regionlets in terms of overall detection accuracy. Furthermore, our method performs well on small and occluded objects.
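
The hard-example selection mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a per-region loss is already available for every candidate region and that "hard" simply means "highest loss"; the function names, the `batch_size` parameter, and the sample loss values are hypothetical and not taken from the paper. Published OHEM variants typically also suppress overlapping regions before selection, which is omitted here.

```python
from typing import List, Sequence


def select_hard_examples(losses: Sequence[float], batch_size: int) -> List[int]:
    """Return the indices of the `batch_size` regions with the highest loss.

    Hypothetical helper: only the selected regions would contribute
    to the backward pass in an OHEM-style training step.
    """
    if batch_size <= 0:
        return []
    # Rank region indices by loss, largest first.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    # Keep the top entries and report them in their original order.
    return sorted(ranked[:batch_size])


def ohem_loss(losses: Sequence[float], batch_size: int) -> float:
    """Average the loss over the selected hard examples only."""
    hard = select_hard_examples(losses, batch_size)
    if not hard:
        return 0.0
    return sum(losses[i] for i in hard) / len(hard)


# Made-up per-region losses for six candidate regions.
roi_losses = [0.05, 2.30, 0.10, 1.70, 0.02, 0.90]
hard_idx = select_hard_examples(roi_losses, batch_size=3)
print(hard_idx)  # [1, 3, 5]
```

Easy regions (those with small loss) are left out of the averaged loss, which is one plausible reading of how skipping uninformative examples could speed up convergence.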

Highlights

  • Object detection, as a remarkably important research field in computer vision, provides crucial information for the semantic understanding of image and video [1], [2]

  • In order to improve the accuracy of object detection, especially the detection of small objects, we propose a novel object detection algorithm based on multi-layer convolution feature fusion (MCFF) and online hard example mining (OHEM) [33]

  • Data set and evaluation criteria: the detection models are trained and the experiments are performed on the KITTI benchmark data set in an intelligent traffic scenario


Summary

Introduction

Object detection, as a remarkably important research field in computer vision, provides crucial information for the semantic understanding of images and video [1], [2]. Object detection algorithms can be broadly divided into two classes, namely classical methods and deep-learning-based methods [12]–[14]. Classical algorithms comprise sliding-window selection [15], [16], manual feature design [17]–[19], and classifier design [20]: candidate regions are generated with sliding windows of different sizes, and the features in the candidate regions are extracted by hand-designed descriptors. Among deep-learning-based methods, representative single-stage detectors include Single Shot MultiBox Detector (SSD) [21] and You Only Look Once (YOLO) [22], [23]; representative region-based detectors include Region-based Convolutional Neural Network (R-CNN) [24], SPP-Net [25], Fast R-CNN [26] and Faster R-CNN [27].
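
To make the classical candidate-generation step concrete, the following is a minimal sketch of sliding windows of different sizes scanned over an image. The function name, the `stride` parameter, and the example dimensions are assumptions chosen for illustration; the cited classical methods differ in how they choose scales and step sizes.

```python
from typing import Iterator, Sequence, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[int, int, int, int]


def sliding_windows(
    image_w: int,
    image_h: int,
    window_sizes: Sequence[Tuple[int, int]],
    stride: int,
) -> Iterator[Box]:
    """Yield every window of each given size that fits inside the image."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    for win_w, win_h in window_sizes:
        # Windows larger than the image produce empty ranges and are skipped.
        for y in range(0, image_h - win_h + 1, stride):
            for x in range(0, image_w - win_w + 1, stride):
                yield (x, y, x + win_w, y + win_h)


# Toy 8x6 image scanned with two window sizes and a stride of 2 pixels.
boxes = list(sliding_windows(8, 6, [(4, 4), (8, 6)], stride=2))
print(len(boxes))  # 7
```

Each yielded box would then be passed to a feature extractor and a classifier. The number of windows grows quickly with image size and the number of scales, which is part of the motivation for learned region proposals.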

