Abstract

Exploring multi-level information to obtain fine-grained features is a key factor in improving the performance of person re-identification (Re-ID). However, existing person Re-ID models focus only on learning high-level semantic information while neglecting low-level detail information. To alleviate this issue, we propose a lightweight person Re-ID method termed the Multi-Scale Semantic and Detail Extraction Network (MSDENet) to obtain robust and discriminative feature representations for the Re-ID task. Specifically, we design a Series Channel-Spatial Attention (SCSA) module and embed it into the lightweight backbone network to focus on the key parts of pedestrian images. Meanwhile, we propose a Multi-Scale Semantic and Detail Extraction (MSDE) method to extract multi-scale features of semantic and detail information, which effectively captures the feature diversity of pedestrian images. Furthermore, we design a Feature Enhancement Fusion (FEF) module, which enhances and fuses the fine-grained features of the semantic and detail extraction branches to better obtain discriminative feature representations. Extensive experiments conducted on the popular Market1501, MSMT17, and CUHK03 datasets demonstrate that the proposed MSDENet achieves competitive performance compared with state-of-the-art methods.
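The abstract does not give implementation details of SCSA. As a rough illustration only, a "series" channel-spatial attention (channel gating followed by spatial gating, in the spirit of CBAM-style modules) could be sketched as below in NumPy; all weights, shapes, and the simplified spatial gate here are hypothetical, not the paper's actual design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    # x: (C, H, W). Squeeze spatial dims with average and max pooling,
    # pass each through a shared two-layer MLP (w1, w2), sum, and gate channels.
    avg = x.mean(axis=(1, 2))                               # (C,)
    mx = x.max(axis=(1, 2))                                 # (C,)
    gate = sigmoid(w2 @ np.maximum(w1 @ avg, 0.0)
                   + w2 @ np.maximum(w1 @ mx, 0.0))         # (C,)
    return x * gate[:, None, None]

def spatial_attention(x):
    # Pool across channels with average and max, combine, and gate each location.
    # (A learned convolution over the pooled maps would replace this plain sum.)
    avg = x.mean(axis=0)                                    # (H, W)
    mx = x.max(axis=0)                                      # (H, W)
    gate = sigmoid(avg + mx)                                # (H, W)
    return x * gate[None, :, :]

def series_channel_spatial_attention(x, w1, w2):
    # "Series" arrangement: channel attention first, then spatial attention
    # applied to the channel-refined feature map.
    return spatial_attention(channel_attention(x, w1, w2))

# Usage on a random feature map with reduction ratio r = 2 (hypothetical sizes).
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))   # squeeze MLP layer
w2 = rng.standard_normal((C, C // r))   # excite MLP layer
y = series_channel_spatial_attention(x, w1, w2)
print(y.shape)  # (8, 4, 4): attention reweights values but preserves shape
```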
