Abstract

Pedestrian attribute recognition still faces several challenges: some attributes occupy only small regions of the image, samples are imbalanced, and recognition accuracy on complex samples is low. To address these problems, we improve the model from two perspectives. 1) We propose the Feature Pyramid Attention Model (FPAM). Because attributes are distributed across different locations in the pedestrian image, FPAM adds an attention mechanism to ResNet-50 so that the model focuses on the key areas of the image. To handle small-target attributes, we adopt a feature pyramid integration strategy. 2) We propose Multi Label Focal Loss (MLFL). Building on the binary cross-entropy loss (CE) and the weighted binary cross-entropy loss (WCE), we add a weight parameter for samples that are difficult to classify, which improves recognition accuracy and speeds up convergence. Results show that the proposed method achieves 84.83% mA, 79.37% Accuracy, 87.47% Precision, 86.09% Recall, and 86.77% F1 on the PETA dataset.
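The idea behind MLFL can be illustrated with a minimal sketch. The exact formula is not given in this section, so the code below is an assumption: it combines the standard focal modulating factor (1 - p_t)^gamma with an exponential per-attribute weight of the kind often used in weighted cross entropy. The function name `multi_label_focal_loss` and the parameters `pos_ratio` and `gamma` are hypothetical and chosen for illustration only.

```python
import math


def multi_label_focal_loss(probs, targets, pos_ratio, gamma=2.0, eps=1e-7):
    """Mean focal-style loss over a batch of multi-label predictions.

    probs, targets: one row per image, one entry per attribute.
    pos_ratio: assumed fraction of positive training samples per attribute.
    gamma: focusing parameter; gamma = 0 reduces to weighted cross entropy.
    """
    total = 0.0
    count = 0
    for p_row, y_row in zip(probs, targets):
        for p, y, r in zip(p_row, y_row, pos_ratio):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            if y == 1:
                weight = math.exp(1.0 - r)  # rarer positives weigh more (assumed form)
                p_t = p
            else:
                weight = math.exp(r)
                p_t = 1.0 - p
            # (1 - p_t) ** gamma shrinks the loss of well-classified samples
            total += -weight * (1.0 - p_t) ** gamma * math.log(p_t)
            count += 1
    return total / count
```

With `gamma = 0` the modulating factor equals 1 and the sketch falls back to a weighted cross entropy; a larger `gamma` shifts more of the total loss onto samples the model currently misclassifies.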

Highlights

  • Pedestrian attribute recognition aims to obtain characteristics of a pedestrian, such as age, gender, and clothing type, from a pedestrian image

  • Feature Pyramid Attention Model (FPAM), attention module: the multi-attribute recognition task requires accurately identifying dozens of pedestrian attributes (the PETA dataset annotates 65 attributes in total), which are located in different regions of the pedestrian

  • Comparison with state-of-the-art methods: on the PETA dataset, the proposed pedestrian multi-attribute recognition network achieves a label-based mean accuracy of 84.83%, an example-based accuracy of 79.37%, a precision of 87.47%, a recall of 86.09%, and an F1 value of 86.77%
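The five reported figures mix one label-based metric (mA) with four example-based metrics. The sketch below shows how these are commonly computed for multi-label attribute recognition. The paper's own evaluation protocol is not reproduced in this section, so the definitions here are an assumption based on common practice, and the function names `label_based_mA` and `example_based` are hypothetical.

```python
def label_based_mA(preds, targets):
    """Mean accuracy: average over attributes of (TPR + TNR) / 2."""
    n_attr = len(targets[0])
    scores = []
    for j in range(n_attr):
        tp = sum(1 for p, t in zip(preds, targets) if p[j] == 1 and t[j] == 1)
        tn = sum(1 for p, t in zip(preds, targets) if p[j] == 0 and t[j] == 0)
        pos = sum(t[j] for t in targets)
        neg = len(targets) - pos
        tpr = tp / pos if pos else 0.0
        tnr = tn / neg if neg else 0.0
        scores.append((tpr + tnr) / 2)
    return sum(scores) / n_attr


def example_based(preds, targets):
    """Per-image accuracy, precision, recall (averaged), and the resulting F1."""
    acc = prec = rec = 0.0
    for p, t in zip(preds, targets):
        inter = sum(1 for a, b in zip(p, t) if a == 1 and b == 1)
        union = sum(1 for a, b in zip(p, t) if a == 1 or b == 1)
        n_pred, n_true = sum(p), sum(t)
        acc += inter / union if union else 0.0
        prec += inter / n_pred if n_pred else 0.0
        rec += inter / n_true if n_true else 0.0
    n = len(targets)
    acc, prec, rec = acc / n, prec / n, rec / n
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1
```

Under this definition F1 is the harmonic mean of precision and recall, and the reported values are consistent with it: 2 × 87.47 × 86.09 / (87.47 + 86.09) ≈ 86.77.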

Summary

INTRODUCTION

Pedestrian attribute recognition aims to obtain characteristics of a pedestrian, such as age, gender, and clothing type, from a pedestrian image. Previous research addressed some problems in pedestrian attribute recognition but ignored the large intra-class variation within an attribute and the small size of some attribute targets, so attribute recognition accuracy did not improve effectively. For this reason, we proposed the Feature Pyramid Attention Model (FPAM), which adds the CBAM attention module to ResNet and integrates multi-level features in the manner of a feature pyramid. In the PETA pedestrian attribute dataset described in the Introduction, the proportions of positive and negative samples differ greatly. As a result, the gradient of the loss function is dominated by the attributes with large proportions, and the trained network performs poorly on attributes with small proportions. The experiments combined the training set and the validation set, so a total of 11,400 images were used for training.
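To make the attention step concrete, here is a minimal pure-Python sketch of the channel-attention half of CBAM: average-pooled and max-pooled channel descriptors pass through a shared two-layer MLP, the outputs are summed, and a sigmoid yields one scale per channel. It is not the authors' implementation. The weights `w1` and `w2` would be learned in practice, biases and CBAM's spatial-attention branch are omitted, and all function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vec, w1, w2):
    """Two-layer MLP with ReLU, C -> C/r -> C (biases omitted for brevity)."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(feature, w1, w2):
    """feature: list of C channels, each an H x W list of lists.

    Returns the rescaled feature map and the per-channel attention scores.
    """
    flat = [[v for row in ch for v in row] for ch in feature]
    avg = [sum(ch) / len(ch) for ch in flat]  # global average pooling
    mx = [max(ch) for ch in flat]             # global max pooling
    scores = [
        sigmoid(a + m)
        for a, m in zip(shared_mlp(avg, w1, w2), shared_mlp(mx, w1, w2))
    ]
    scaled = [[[v * s for v in row] for row in ch] for ch, s in zip(feature, scores)]
    return scaled, scores
```

Each channel is multiplied by a score between 0 and 1, so informative channels are kept while others are suppressed. In a feature pyramid setting, a module like this could be applied to feature maps taken from several ResNet stages before they are fused.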

EVALUATION PROTOCOLS
EXPERIMENTS
Findings
CONCLUSION