Abstract

Automatically recognizing attributes such as gender, age, footwear, and clothing style from pedestrian images captured at far distance is an important task in surveillance scenarios. However, the appearance diversity and ambiguity in these images make it challenging. This paper presents an end-to-end Neural Pedestrian Attribute Recognition (Neural PAR) model to address these challenges. Rather than treating the task as a conventional recognition problem, as previous methods do, Neural PAR formulates it as an end-to-end image-to-attribute-description problem. To this end, the training images and their corresponding attributes are used as inputs. Specifically, the attributes are concatenated into attribute descriptions so as to contextualize the potential relationships among them. A neural network combining a CNN and an LSTM is then trained to learn the complex relations between visual features and their corresponding attributes. Extensive experiments show that the proposed Neural PAR significantly outperforms state-of-the-art methods on the benchmark PETA dataset.
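For concreteness, the following is a minimal PyTorch sketch of the kind of CNN-plus-LSTM, image-to-attribute-description formulation described above. The backbone choice (ResNet-50), the embedding and hidden sizes, and the way the pooled image feature seeds the LSTM are illustrative assumptions; the abstract only states that a CNN extracts visual features and an LSTM generates the attribute description.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class NeuralPARSketch(nn.Module):
    """Illustrative CNN encoder + LSTM decoder that emits attributes as a token sequence."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        # Hypothetical backbone; the paper only specifies that a CNN is used.
        backbone = models.resnet50()
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # globally pooled features
        self.feat_proj = nn.Linear(2048, embed_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, attr_tokens):
        # Encode the image and feed it as the first step of the attribute description.
        feats = self.encoder(images).flatten(1)          # (B, 2048)
        img_embed = self.feat_proj(feats).unsqueeze(1)   # (B, 1, embed_dim)
        tok_embed = self.embed(attr_tokens)              # (B, T, embed_dim)
        inputs = torch.cat([img_embed, tok_embed], dim=1)
        out, _ = self.lstm(inputs)                       # (B, T + 1, hidden_dim)
        return self.classifier(out)                      # logits over the attribute vocabulary
```

In such a setup, training would typically minimize a cross-entropy loss between the predicted logits and the next attribute token in the concatenated description, so that the decoder learns co-occurrence relations among attributes rather than predicting each one independently.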
