Multi-View Object Detection Based on Deep Learning

Cong Tang, Wei Jin, Yongshun Ling, Chao Zheng, Xing Yang

doi:10.3390/app8091423


Open Access

https://doi.org/10.3390/app8091423


Journal: Applied Sciences
Publication Date: Aug 21, 2018
Citations: 32
License type: CC BY 4.0

Affiliation: National University of Defense Technology

Abstract

A multi-view object detection approach based on deep learning is proposed in this paper. Classical object detection methods based on regression models are introduced, and the reasons for their weak ability to detect small objects are analyzed. To improve the performance of these methods, a multi-view object detection approach is proposed, and the model structure and working principles of this approach are explained. Additionally, the object retrieval ability and object detection accuracy of both the multi-view methods and the corresponding classical methods are evaluated and compared based on a test on a small object dataset. The experimental results show that in terms of object retrieval capability, Multi-view YOLO (You Only Look Once: Unified, Real-Time Object Detection), Multi-view YOLOv2 (based on an updated version of YOLO), and Multi-view SSD (Single Shot Multibox Detector) achieve AF (average F-measure) scores that are higher than those of their classical counterparts by 0.177, 0.06, and 0.169, respectively. Moreover, in terms of the detection accuracy, when difficult objects are not included, the mAP (mean average precision) scores of the multi-view methods are higher than those of the classical methods by 14.3%, 7.4%, and 13.1%, respectively. Thus, the validity of the approach proposed in this paper has been verified. In addition, compared with state-of-the-art methods based on region proposals, multi-view detection methods are faster while achieving mAPs that are approximately the same in small object detection.
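The AF (average F-measure) scores quoted in the abstract build on the standard F-measure, the harmonic mean of precision and recall. The following is a minimal sketch of that calculation, not the authors' evaluation code. The abstract does not say how the average is taken, so averaging over per-class detection counts is an assumption here, and the function names `f_measure` and `average_f_measure` are hypothetical.

```python
from typing import Iterable, Tuple


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Return the F-measure (F1) from true-positive, false-positive
    and false-negative detection counts."""
    if tp == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f_measure(counts: Iterable[Tuple[int, int, int]]) -> float:
    """Average the F-measure over several (tp, fp, fn) triples.

    Assumption: one triple per object class. The paper may average
    differently (for example over confidence thresholds).
    """
    scores = [f_measure(tp, fp, fn) for tp, fp, fn in counts]
    if not scores:
        raise ValueError("at least one (tp, fp, fn) triple is required")
    return sum(scores) / len(scores)
```

A call such as `average_f_measure([(8, 2, 2), (5, 5, 0)])` takes one triple of counts per class and returns a single score between 0 and 1.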

Highlights

Object detection represents a significant focus of research in the field of computer vision [1] that can be applied in driverless cars, robotics, video surveillance and pedestrian detection [2,3,4].
Traditional object detection methods are primarily based on establishing mathematical models according to prior knowledge; such methods include the Hough transform method [5], the frame-difference method [6], the background subtraction method [7], the optical flow method [8], the sliding window model method [9] and the deformable part model method [10].
This paper proposes a multi-view object detection approach based on deep learning, with the aim of improving the performance of regression-based deep learning models when detecting small objects.

Summary

Introduction

Object detection is a significant focus of research in computer vision [1], with applications in driverless cars, robotics, video surveillance and pedestrian detection [2,3,4]. Traditional object detection methods are primarily based on establishing mathematical models according to prior knowledge; such methods include the Hough transform method [5], the frame-difference method [6], the background subtraction method [7], the optical flow method [8], the sliding window model method [9] and the deformable part model method [10]. The first four of these methods operate in a mode based on feature extraction and mathematical modeling, which uses certain features of the data to build a mathematical model and obtains the object detection results by solving that model. The latter two operate in a mode based on feature extraction and classification modeling, which combines hand-crafted features with a classification model. Because deep neural networks can automatically learn different features, object detection based on deep learning is characterized by more abundant features and stronger feature representation capabilities than are possible with traditional hand-crafted features [16].
