Abstract
Surveillance cameras are now ubiquitous, continuously monitoring pedestrians as they move through a scene. In this context, our paper addresses the problem of pedestrian attribute recognition (PAR), which entails recognizing attributes such as age group, clothing style, accessories, and footwear style. This multi-label problem is challenging even for human observers and has garnered attention from the computer vision community. Towards a solution, we adopt trainable Gabor wavelet (TGW) layers and cascade them with a convolutional neural network (CNN). Whereas other researchers use fixed Gabor filters with the CNN, the proposed layers are learnable and adapt to the dataset for better recognition. We propose a two-branch neural network in which mixed layers, a combination of TGW and convolutional layers, form the building block of the deep network. We test our method on two challenging publicly available datasets and compare our results with the state of the art.
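To make the contrast between fixed and trainable Gabor filters concrete, the following is a minimal sketch of how a Gabor kernel is generated from a handful of scalar parameters. It uses the standard real-valued Gabor function. The abstract does not state the parameterization used in the paper, so the function names (`gabor_kernel`, `filter_bank`) and the parameter choices are illustrative assumptions only. In a fixed filter bank these scalars are chosen by hand; in a trainable layer they would instead be updated by gradient descent together with the CNN weights.

```python
import math


def gabor_kernel(size, theta, sigma, lambd, psi=0.0, gamma=1.0):
    """Return a size x size real Gabor kernel as a list of rows.

    Hypothetical helper, not the paper's code. `size` should be odd.
    theta: orientation, sigma: Gaussian width, lambd: wavelength,
    psi: phase offset, gamma: spatial aspect ratio.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_bank(size, n_orientations, sigma, lambd):
    """Build kernels at evenly spaced orientations in [0, pi)."""
    return [
        gabor_kernel(size, k * math.pi / n_orientations, sigma, lambd)
        for k in range(n_orientations)
    ]


bank = filter_bank(size=5, n_orientations=4, sigma=2.0, lambd=4.0)
```

Each kernel here is fully determined by five scalars, far fewer than the 25 free weights of an unconstrained 5 x 5 convolution kernel. That small parameter count is one reason Gabor-parameterized layers are typically combined with ordinary convolutional layers rather than used alone.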
Highlights
Pedestrian attribute recognition (PAR) is one of the active areas of research in computer vision
The Recurrent Convolutional (RC) model mines the correlations among different attribute groups, while the Recurrent Attention (RA) model uses intra-group attention correlation and intra-group spatial locality to improve the performance and robustness of pedestrian attribute recognition
We evaluate the effectiveness of the proposed method on both PEdesTrian Attribute (PETA) and Richly Annotated Pedestrian (RAP) datasets
Summary
Pedestrian attribute recognition (PAR) is one of the active areas of research in computer vision. The orientation of a person or a camera can, for example, hide a backpack partially or completely from a particular view. Such examples show that the setup of the acquisition environment for image or video capture results in high intra-class variation for the same visual attributes. The combination of low image resolution with self-occlusions or view-oriented occlusions makes visual attribute identification a very challenging problem. Many of these issues can be seen in the most widely used pedestrian datasets. The dataset shows a large variation in the attributes due to pedestrian appearance, viewpoints and severe occlusions. An analysis of these datasets shows that identifying visual attributes from these images is difficult because of their very low quality. We report results on the various datasets in the Evaluation section, before our final remarks in the Conclusion.