Abstract

Pedestrian attributes carry rich information, spanning both abstract and detailed annotations. Pedestrian attribute recognition (PAR) can generate high-level semantic feature maps and provide auxiliary information for tasks such as person retrieval and re-identification. Because pedestrian attributes cover broad categories and appear in complex region combinations in video sequences, enhancing the feature representation of fine-grained attributes is the key to improving pedestrian attribute recognition. This paper proposes an end-to-end feature-enhanced multi-scale dual-branch model for multiple attribute recognition (FEMDAR), built from a feature-enhanced block (FEB) and multi-scale modules. The FEB fuses the contextual features of each level, stacking dilated convolutions with different rates to effectively expand the receptive field of the feature representation. Simultaneously, the FEB employs a dual-branch structure to incorporate the enhanced feature representation. Moreover, a new residual module is designed with multi-scale modules to capture the scale inconsistency across attributes. Hence, the proposed model achieves robust feature representations that flexibly adapt to multiple pedestrian attributes at different scales. We validate our model on three public large-scale pedestrian attribute datasets. The experimental results show that FEMDAR offers prominent advantages in instance-based measurements. Furthermore, the ablation study confirms the effectiveness of the proposed FEB and multi-scale modules for feature representation.
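The core FEB idea described above, stacking dilated convolutions with different rates and fusing them to enlarge the receptive field without losing spatial resolution, can be sketched as follows. This is a minimal illustrative implementation, not the paper's actual architecture; the dilation rates, channel counts, and the `DilatedStack` name are assumptions for demonstration only.

```python
import torch
import torch.nn as nn


class DilatedStack(nn.Module):
    """Illustrative sketch of the dilated-convolution stacking idea.

    Parallel 3x3 convolutions with increasing dilation rates see
    progressively larger receptive fields; a 1x1 convolution fuses
    the multi-rate contextual features. Rates (1, 2, 4) are a
    hypothetical choice, not taken from the paper.
    """

    def __init__(self, channels: int, rates=(1, 2, 4)):
        super().__init__()
        # padding == dilation keeps the spatial size unchanged for a 3x3 kernel
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r)
            for r in rates
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each branch captures context at a different scale;
        # concatenating and fusing combines them into one enhanced map.
        feats = [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    out = DilatedStack(64)(x)
    print(out.shape)  # spatial size is preserved: (1, 64, 32, 32)
```

With kernel size 3 and dilation r, the effective kernel span is 1 + 2r, so padding r preserves the feature-map resolution while each added branch widens the receptive field, which is what allows the stack to fuse contextual features across levels.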
