Abstract

Traditional person re-identification models usually rely on a CNN for feature extraction. Converting the feature map into a feature vector then requires a large number of convolution operations to reduce the size of the feature map. In a CNN, the receptive field of each layer is obtained by convolving the feature map of the previous layer, so the local receptive field is limited in size and the computational cost is large. To address these problems, this article designs an end-to-end person re-identification model (twinsReID) that uses the self-attention mechanism of the Transformer and integrates feature information between levels. In a Transformer, the output of each layer is computed from the correlation between each element of the previous layer and the other elements. Because every element computes its correlation with the other elements, this operation is equivalent to a global receptive field, and because the calculation is simple, its cost is small. In these respects, the Transformer has certain advantages over the convolution operation of a CNN. This paper replaces the CNN with a Twins-SVT Transformer, combines the features extracted from two different stages, and divides them into two branches. First, the feature map is convolved to obtain a fine-grained feature map, and global adaptive average pooling is applied on the second branch to obtain a feature vector. The feature map is then divided into two sections, and global adaptive average pooling is applied to each. The three resulting feature vectors are each sent to the Triplet Loss. The feature vectors are also passed through a fully connected layer, whose output is fed to the Cross-Entropy Loss and the Center Loss. The model is evaluated on the Market-1501 dataset. It reaches 85.4% mAP and 93.7% rank-1 accuracy, rising to 93.6% mAP and 94.9% rank-1 after reranking. A parameter count shows that the model has fewer parameters than the traditional CNN model.
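
The pooling step that yields the three feature vectors can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the nested-list [C][H][W] layout, and the toy values are hypothetical, and the sketch assumes that the "two sections" are the upper and lower halves of the feature map along its height, which the abstract does not state.

```python
from statistics import fmean


def global_avg_pool(fmap):
    """Average each channel of a [C][H][W] nested list over H and W."""
    return [fmean(v for row in channel for v in row) for channel in fmap]


def three_feature_vectors(fmap):
    """Return one global vector and two part vectors.

    Assumption: the two sections are the upper and lower halves along H.
    """
    h = len(fmap[0])
    upper = [channel[: h // 2] for channel in fmap]
    lower = [channel[h // 2:] for channel in fmap]
    return global_avg_pool(fmap), global_avg_pool(upper), global_avg_pool(lower)


# Toy feature map (hypothetical values): 2 channels on a 4x2 grid.
fmap = [
    [[1, 2], [3, 4], [5, 6], [7, 8]],
    [[0, 0], [0, 0], [2, 2], [2, 2]],
]
global_vec, upper_vec, lower_vec = three_feature_vectors(fmap)
print(global_vec, upper_vec, lower_vec)
```

Each returned vector has one entry per channel. In a real model the same reduction would be a single tensor operation on the backbone output; plain lists are used here only to keep the sketch dependency-free.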
