Abstract

Pedestrian detection is a crucial task in many vision-based applications, such as video surveillance, human activity analysis and autonomous driving. Most existing pedestrian detection frameworks focus only on detection accuracy or only on the number of model parameters. How to balance detection accuracy against model size remains an open problem for the practical application of pedestrian detection. In this paper, we propose a parallel, lightweight framework for pedestrian detection, named ParallelNet. ParallelNet consists of four branches, each of which learns different high-level semantic features. The branch outputs are fused into one feature map that serves as the final feature representation. To reduce the number of model parameters, we replace some convolution modules in the backbone with Fire modules, each of which consists of a Squeeze part and an Expand part. Finally, focal loss is introduced into ParallelNet for end-to-end training. Experimental results on the Caltech–Zhang and KITTI datasets show that, compared with single-branch networks such as ResNet and SqueezeNet, ParallelNet achieves higher detection accuracy with fewer model parameters and fewer giga floating-point operations (GFLOPs).
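To make the two lightweight ingredients named in the abstract more concrete, the following is a minimal sketch in plain Python. It counts the parameters of a SqueezeNet-style Fire module against a plain 3x3 convolution, and evaluates the standard binary focal loss for a single prediction. The channel sizes (128, 16, 64, 64) and the defaults `alpha=0.25`, `gamma=2.0` are illustrative assumptions, not the configuration used in ParallelNet, and the helper names are hypothetical.

```python
import math


def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution layer."""
    return c_in * c_out * k * k + c_out


def fire_params(c_in, squeeze, expand1x1, expand3x3):
    """Parameter count of a SqueezeNet-style Fire module (assumed layout)."""
    s = conv_params(c_in, squeeze, 1)         # Squeeze part: 1x1 convolution
    e1 = conv_params(squeeze, expand1x1, 1)   # Expand part: 1x1 branch
    e3 = conv_params(squeeze, expand3x3, 3)   # Expand part: 3x3 branch
    return s + e1 + e3


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability p and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t) ** gamma factor down-weights well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical layer: 128 input channels, 128 output channels.
plain = conv_params(128, 128, 3)
fire = fire_params(128, 16, 64, 64)   # the two expand branches concatenate to 128
print(plain, fire)

# An easy positive (p = 0.9) contributes far less loss than a hard one (p = 0.1).
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

Under these assumed sizes the Fire module needs roughly one twelfth of the parameters of the plain convolution while producing the same number of output channels, which illustrates why swapping convolutions for Fire modules shrinks a backbone.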

Highlights

  • Pedestrian detection is an active research area in object detection [1]

  • The first group of methods is based on feature engineering and follows the “feature extractor + classifier” pattern, such as Histogram of Oriented Gradient (HOG) + Support Vector Machines (SVM) [7], Integral Channel Features (ICF) + AdaBoost [8] and Deformable Part Model (DPM) + Latent Support Vector Machines (LatSVM) [9]

  • With the goal of improving detection accuracy, we considered how to reduce the number of model parameters

Summary

Introduction

Pedestrian detection is an active research area in object detection [1]. Its aim is to detect all pedestrians in each frame and locate their positions, for applications in video surveillance [2], motion detection [3], intelligent transportation [4] and autonomous driving [5,6]. Existing methods fall into two groups. The first is based on feature engineering and follows the “feature extractor + classifier” pattern, such as Histogram of Oriented Gradient (HOG) + Support Vector Machines (SVM) [7], Integral Channel Features (ICF) + AdaBoost [8] and Deformable Part Model (DPM) + Latent Support Vector Machines (LatSVM) [9]. These methods remain limited both in their performance in engineering applications and in their generalization [10]. The second is based on neural networks. With the development of neural networks [11,12,13], general object detection algorithms have been applied to pedestrian detection, and these can be further divided into two categories: anchor-based and anchor-free.
