Abstract

Person re-identification (re-ID) aims at matching a person-of-interest across non-overlapping cameras despite significant variations in visual appearance. Existing methods mainly train deep neural networks on large-scale person re-ID datasets and achieve good performance. However, these methods rely only on visual data, which is easily affected by environmental variations (e.g., viewpoint, pose, and illumination). In this paper, we propose an adaptive multi-task learning (MTL) scheme for cross-domain and cross-modal person re-ID. It can effectively exploit visual and language information from multiple datasets to improve learning performance. Comprehensive experiments are conducted on the widely used person re-ID datasets Market-1501 and DukeMTMC-reID, validating the effectiveness of the proposed method. The scheme models domain differences and the relationship between the vision and language modalities, achieving state-of-the-art performance. The source code of our proposed method will be available at https://github.com/emdata-ailab/Multitask_Learning_ReID.
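The abstract does not describe how the per-task losses are adaptively balanced. As a purely illustrative sketch (not the authors' implementation), one common choice for adaptive multi-task weighting is learnable homoscedastic-uncertainty weighting; the class and task names below are hypothetical assumptions for illustration only.

import torch
import torch.nn as nn

class AdaptiveMultiTaskLoss(nn.Module):
    """Illustrative adaptive task weighting via learnable uncertainty terms.

    Each task loss is scaled by exp(-log_var) and regularized by log_var,
    so the weighting is learned jointly with the backbone. This is a generic
    sketch, not the weighting scheme used in the paper.
    """

    def __init__(self, num_tasks: int):
        super().__init__()
        # One learnable log-variance per task (initialized to 0, i.e. weight 1).
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]
        return total


# Hypothetical usage: one ID loss per re-ID dataset (cross-domain)
# plus a vision-language alignment loss (cross-modal).
if __name__ == "__main__":
    mtl_loss = AdaptiveMultiTaskLoss(num_tasks=3)
    market_loss = torch.tensor(1.2)  # e.g., softmax ID loss on Market-1501
    duke_loss = torch.tensor(0.9)    # e.g., softmax ID loss on DukeMTMC-reID
    align_loss = torch.tensor(0.5)   # e.g., image-text matching loss
    total = mtl_loss([market_loss, duke_loss, align_loss])
    print(total.item())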
