Abstract

Person re-identification aims to match images of the same person across different scenarios. The key challenge of this task is extracting discriminative features from person images despite complex background noise, severe occlusions, and large pose variations. Recently, some studies have attempted to apply human semantic parsing or attention mechanisms to help capture human parts or important object regions. Despite the performance improvements, these methods either require additional prior knowledge or ignore the connections between human body parts. To address these problems, we propose a novel Visual Semantic Representation Mining and Reasoning (SRMR) block to capture more discriminative semantic features of person images. Specifically, to remove the dependence on external models, we propose to directly mine the clustering relationships among local features within the global structure and identify discriminative regions in the person image by voting. Then, to establish relationships between person body parts, we employ a Graph Convolutional Network (GCN) to model the correlations between body parts. Extensive ablation studies demonstrate that our SRMR block significantly improves feature representation power and achieves state-of-the-art performance on several popular benchmarks.
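To make the reasoning step more concrete, the sketch below shows one possible way a GCN layer could propagate information among part-level features. This is a minimal illustration, not the authors' implementation: the number of parts, the learnable adjacency matrix, and the class name `PartGCNLayer` are assumptions introduced for clarity.

```python
# Minimal sketch (hypothetical, not the paper's released code) of a graph
# convolution over person body-part features, assuming P part features of
# dimension D and a learnable P x P adjacency among parts.
import torch
import torch.nn as nn

class PartGCNLayer(nn.Module):
    """One graph-convolution step relating body-part features."""
    def __init__(self, in_dim, out_dim, num_parts):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)
        # Learnable part-to-part adjacency (assumption; the paper may construct it differently).
        self.adj = nn.Parameter(torch.eye(num_parts))

    def forward(self, x):
        # x: (batch, num_parts, in_dim) part features, e.g. pooled from local regions.
        a = torch.softmax(self.adj, dim=-1)        # normalize relations per part
        x = torch.einsum('pq,bqd->bpd', a, x)      # propagate features among related parts
        return torch.relu(self.weight(x))          # transform and activate

# Usage sketch: 6 hypothetical parts with 256-dimensional features.
layer = PartGCNLayer(256, 256, num_parts=6)
parts = torch.randn(4, 6, 256)
out = layer(parts)  # (4, 6, 256) relation-enhanced part features
```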
