Large scale variance remains a notable challenge for vehicle detection in complex traffic scenes. Although two-stage detectors based on Faster RCNN have demonstrated superiority on generic object detection, their effectiveness on vehicle detection in real applications is not well understood, especially for vehicles of tiny scales. In this paper, a novel two-stage detector is proposed to perform tiny vehicle detection with high recall. It consists of a backward feature enhancement network (BFEN) and a spatial layout preserving network (SLPN). In the first stage, little attention has previously been paid to generating proposals with high recall, even though this plays a significant role in recalling vehicles of tiny scales. By fully exploiting the feature representation power of a backbone network, the BFEN aims at generating high-quality region proposals for vehicles of various scales. Even with only 100 proposals, the proposed BFEN achieves an encouraging recall rate of over 99%. For better localization of tiny vehicles, we argue that the spatial layout of the ROI features also plays a significant role in the second stage. Accordingly, a lightweight detection sub-network called SLPN is designed to progressively integrate ROI features while preserving their spatial layouts. Experiments on the challenging DETRAC vehicle detection dataset show that the proposed method significantly improves on a competitive baseline (ResNet-50 based Faster RCNN) by 16.5% mAP. Performance comparable to the state of the art is also achieved on both the DETRAC and KITTI benchmarks.