Contextual and Multi-Scale Feature Fusion Network for Traffic Sign Detection

Wei Zhang, Yandong Tang, Huijie Fan, Qiang Wang

doi:10.1109/cyber50695.2020.9279180

Abstract

Traffic sign detection, an important part of an automatic driving system, requires high accuracy. In this paper, we propose an end-to-end deep learning network, named the Contextual and Multi-Scale Feature Fusion Network, for traffic sign detection. The model consists of two sub-networks: the Weighted Multi-scale Feature Learning network (W-net) and the Contextual-Attention Learning network (C-net). The W-net extracts multi-scale features and calculates a weight for each feature map layer so that traffic signs can be detected at different scales. The C-net learns a contextual attention mask of interference items and concatenates it with the multi-scale features, which efficiently reduces false detections. In extensive quantitative and qualitative experiments, the proposed model outperforms several state-of-the-art traffic sign detection methods.
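
The two fusion steps named in the abstract, per-layer weighting of multi-scale feature maps and concatenation with a contextual attention mask, can be illustrated with a minimal sketch. This is not the authors' implementation: the softmax normalization, the function names `weight_scales` and `concat_with_mask`, and the assumption that all feature maps have already been resized to one spatial size are hypothetical choices made only for illustration.

```python
import math


def softmax(scores):
    """Normalize raw per-scale scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_scales(feature_maps, scale_scores):
    """Multiply each 2-D feature map by its normalized scale weight.

    Assumption: softmax weighting stands in for whatever weighting
    the W-net actually learns.
    """
    weights = softmax(scale_scores)
    return [
        [[w * value for value in row] for row in fmap]
        for w, fmap in zip(weights, feature_maps)
    ]


def concat_with_mask(weighted_maps, attention_mask):
    """Append the attention mask as one extra channel (channel-first)."""
    return weighted_maps + [attention_mask]


# Two toy 2x2 feature maps standing in for two scales.
feature_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
scale_scores = [0.0, 0.0]          # hypothetical learned scores
attention_mask = [[0.0, 1.0], [1.0, 0.0]]  # hypothetical C-net output

weighted = weight_scales(feature_maps, scale_scores)
fused = concat_with_mask(weighted, attention_mask)
```

Here `fused` is a channel-first stack holding one weighted map per scale plus the mask, which a downstream detection head would consume in a real network.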
