Abstract

Person re-identification (Re-ID) has made significant progress in recent years, yet it still faces numerous challenges in real-world scenarios. Although researchers have proposed various solutions, similar clothing colors remain an obstacle to improving Re-ID performance. To address this issue, we propose a dual-stream feature fusion network (DSFF-Net) that extracts discriminative features from pedestrian images in two color spaces. Specifically, a dual-stream network is designed to extract RGB global features, grayscale global features, and local features of pedestrian images, increasing the richness of pedestrian representations. A channel attention module is designed to guide the network toward the salient features of pedestrians. An embedding mixed pooling operation is designed to integrate the outputs of global average pooling (GAP) and global max pooling (GMP), yielding more discriminative global features while removing redundant information. A fine-grained local feature embedding fusion operation is designed to obtain more discriminative local features by embedding and fusing the fine-grained local features of RGB and grayscale pedestrian images. Since the final pedestrian representation fuses both global features and fine-grained discriminative features from the RGB and grayscale spaces, DSFF-Net increases both the discriminative capability and the richness of pedestrian representations. We conduct extensive experiments on three datasets, Market-1501, DukeMTMC-reID, and CUHK03, where our method achieves Rank-1/mAP of 95.9%/89.1%, 89.0%/79.2%, and 81.2%/78.7%, respectively. Experimental results show that DSFF-Net outperforms most state-of-the-art person Re-ID methods.
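The abstract does not give implementation details of the embedding mixed pooling, so the following is only a minimal PyTorch sketch of one plausible reading: a head that applies GAP and GMP to the same backbone feature map and integrates the two pooled outputs into a single global descriptor. The sum fusion, the 1x1 embedding convolution, and all dimensions (in_channels=2048, embed_dim=512) are illustrative assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class EmbeddingMixedPooling(nn.Module):
    """Sketch of a mixed pooling head: integrates GAP and GMP outputs
    of a backbone feature map into one global feature vector.
    Fusion-by-sum and the 1x1 embedding conv are assumptions; the
    abstract only states that GAP and GMP outputs are combined."""

    def __init__(self, in_channels: int, embed_dim: int = 512):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)  # global average pooling
        self.gmp = nn.AdaptiveMaxPool2d(1)  # global max pooling
        # 1x1 conv + BN embed the pooled vector into a compact descriptor
        self.embed = nn.Sequential(
            nn.Conv2d(in_channels, embed_dim, kernel_size=1, bias=False),
            nn.BatchNorm2d(embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from a CNN backbone
        pooled = self.gap(x) + self.gmp(x)    # integrate the two pooled views
        return self.embed(pooled).flatten(1)  # (B, embed_dim) global feature


if __name__ == "__main__":
    feats = torch.randn(4, 2048, 24, 8)  # e.g., a ResNet-50 stage-4 output
    head = EmbeddingMixedPooling(2048)
    print(head(feats).shape)             # torch.Size([4, 512])
```

Summing the average- and max-pooled views is one common way to retain both the smooth context captured by GAP and the peak responses captured by GMP; concatenation followed by a projection would be an equally plausible reading of the abstract.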
