Abstract

Pedestrian attributes carry rich information, spanning both abstract and detailed annotations. Pedestrian attribute recognition (PAR) can generate high-level semantic feature maps and provide auxiliary information for tasks such as person retrieval and re-identification. Because pedestrian attributes cover broad categories and appear in complex region combinations in video sequences, enhancing the feature representation of fine-grained attributes is the key to improving pedestrian attribute recognition. This paper proposes an end-to-end feature-enhanced multi-scale dual-branch model for multiple attribute recognition (FEMDAR), built from a feature-enhanced block (FEB) and multi-scale modules. The FEB fuses the contextual features of each level, stacking dilated convolutions with different rates to effectively expand the receptive field of the feature representation. Simultaneously, the FEB employs a dual-branch structure to incorporate the enhanced feature representation. Moreover, a new residual module is designed with multi-scale modules to capture the scale inconsistency across attributes. Hence, the proposed model achieves robust feature representations that flexibly adapt to multiple pedestrian attributes at different scales. We validate our model on three public large-scale pedestrian attribute datasets. The experimental results show that FEMDAR offers prominent advantages in instance-based measurements. Furthermore, the ablation study confirms the effectiveness of the proposed FEB and multi-scale modules for feature representation.
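The core FEB idea described above, stacking dilated convolutions with different rates and fusing them to enlarge the receptive field without losing spatial resolution, can be sketched as follows. This is a minimal illustrative implementation, not the paper's actual architecture; the dilation rates, channel counts, and the `DilatedStack` name are assumptions for demonstration only.

```python
import torch
import torch.nn as nn


class DilatedStack(nn.Module):
    """Illustrative sketch of the dilated-convolution stacking idea.

    Parallel 3x3 convolutions with increasing dilation rates see
    progressively larger receptive fields; a 1x1 convolution fuses
    the multi-rate contextual features. Rates (1, 2, 4) are a
    hypothetical choice, not taken from the paper.
    """

    def __init__(self, channels: int, rates=(1, 2, 4)):
        super().__init__()
        # padding == dilation keeps the spatial size unchanged for a 3x3 kernel
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r)
            for r in rates
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each branch captures context at a different scale;
        # concatenating and fusing combines them into one enhanced map.
        feats = [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    out = DilatedStack(64)(x)
    print(out.shape)  # spatial size is preserved: (1, 64, 32, 32)
```

With kernel size 3 and dilation r, the effective kernel span is 1 + 2r, so padding r preserves the feature-map resolution while each added branch widens the receptive field, which is what allows the stack to fuse contextual features across levels.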
