Abstract

Deep learning has brought great progress in pedestrian detection. For deep-learning-based detectors, how well features are used has become the key determinant of detection performance. Although current pedestrian detectors have worked to improve feature utilization, it remains inadequate. To address this, we propose the Multi-Level Feature Fusion Module (MFFM) and its sub-module, the Multi-Scale Feature Fusion Unit (MFFU), which connect feature maps of the same scale and of different scales through horizontal connections, vertical connections, and shortcut structures. Every connection carries a learnable weight, so the modules act as adaptive multi-level and multi-scale feature fusion modules that fuse the best features. We then build a complete pedestrian detector, the Adaptive Feature Fusion Detector (AFFDet), an anchor-free one-stage pedestrian detector that makes full use of features for detection. Compared with other methods, our method performs better on the challenging Caltech Pedestrian Detection Benchmark (Caltech) and runs at a quite competitive speed. It is the current state-of-the-art one-stage pedestrian detection method.
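
The idea of fusing feature maps of different scales through connections with learnable weights can be illustrated with a minimal sketch. This is not the authors' MFFU or MFFM implementation. It assumes single-channel feature maps stored as nested lists, nearest-neighbour resizing to a common resolution, and softmax normalization of the raw weights; the abstract specifies none of these choices, and the names `softmax`, `resize_nearest`, and `fuse` are hypothetical.

```python
import math


def softmax(raw_weights):
    """Turn raw (learnable) scalars into positive coefficients that sum to 1."""
    peak = max(raw_weights)  # subtract the max for numerical stability
    exps = [math.exp(w - peak) for w in raw_weights]
    total = sum(exps)
    return [e / total for e in exps]


def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2D feature map (assumed resizing scheme)."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse(feature_maps, raw_weights, out_h, out_w):
    """Weighted sum of feature maps after bringing them to one resolution.

    raw_weights holds one scalar per input map; in a real detector these
    would be trained by backpropagation rather than set by hand.
    """
    if len(feature_maps) != len(raw_weights):
        raise ValueError("need exactly one weight per feature map")
    coeffs = softmax(raw_weights)
    resized = [resize_nearest(f, out_h, out_w) for f in feature_maps]
    return [
        [sum(k * f[r][c] for k, f in zip(coeffs, resized)) for c in range(out_w)]
        for r in range(out_h)
    ]


fine = [[1.0, 2.0], [3.0, 4.0]]  # higher-resolution map (2x2)
coarse = [[10.0]]                # lower-resolution map (1x1)

# Equal raw weights give equal coefficients (0.5 each).
fused = fuse([fine, coarse], [0.0, 0.0], 2, 2)
print(fused)  # [[5.5, 6.0], [6.5, 7.0]]
```

Raising one raw weight relative to the others shifts the fused output toward that map, which is the sense in which learnable weights let a model choose the proportions in which scales are mixed.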

Highlights

  • Pedestrian detection is an important problem in computer vision and is critical in many practical fields, such as autonomous driving and security

  • We propose our original Multi-Scale Feature Fusion Unit (MFFU) as an adaptive feature fusion module, which lets the model fuse feature maps of different scales in appropriate proportions

  • The Multi-Level Feature Fusion Module (MFFM) can further adaptively fuse, at multiple levels, features that have already been fused by feature fusion modules such as MFFUs

  • To address the inadequate feature utilization of other works, we propose a complete pedestrian detector based on our adaptive feature fusion idea


Summary

Introduction

Pedestrian detection is an important problem in computer vision and is critical in many practical fields, such as autonomous driving and security. The Multi-Level Feature Fusion Module (MFFM) can further adaptively fuse, at multiple levels, features that have already been fused by feature fusion modules such as MFFUs. To address the inadequate feature utilization of other works, we propose a complete pedestrian detector based on our adaptive feature fusion idea. This detector adaptively fuses multi-scale features at multiple levels and has been shown to be a state-of-the-art one-stage pedestrian detector with competitive speed on the challenging Caltech benchmark.

Two-Stage Detection
Anchor-Based One-Stage Detection
Anchor-Free One-Stage Detection
Preliminary
Overall Architecture
Backbone Module
Detection Head
Loss Function
Inference
Datasets
Evaluation Standard
Training
Backbone Networks
Ablation Study
Method
Comparisons with Other Methods
Findings
Discussion and Conclusions
