Abstract
In this article, we propose a novel deep framework, termed the local alignment deep network (LADN), for infrared-visible cross-modal person re-identification (IVCM ReID) in 6G-enabled IoT, which can meet the demands of all-day, real-time surveillance. The proposed LADN adopts a two-stream structure that learns shallow feature maps of interest and common-subspace feature maps to reduce the gap between infrared (IR) and visible (RGB) images. To overcome the challenge of pose and viewpoint variations of pedestrians, we learn local features in the deep layers. We also propose the local alignment triplet (LAT) loss to align local features, which captures consistent local information by comparing non-corresponding local features within a certain range. Furthermore, we learn global features to provide a global field of view for the representation. The proposed LADN is optimized end-to-end by combining different cross-modality losses. We evaluate the proposed method on two standard benchmark data sets, SYSU-MM01 and RegDB, and the results demonstrate the effectiveness of LADN.
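To make the LAT loss concrete, below is a minimal PyTorch sketch (not the authors' released code) of a local-alignment triplet loss: each part-level feature of the anchor is compared against the corresponding part of another image and its neighbors within a small shift range, and the minimum distance is kept, so that modest vertical misalignments caused by pose or viewpoint changes are tolerated. The function names, the `shift` parameter, and the mean aggregation over parts are assumptions for illustration; the paper's exact formulation may differ.

```python
# A minimal sketch of a local-alignment triplet loss (an assumption-based
# illustration, not the authors' released implementation).
import torch
import torch.nn.functional as F

def aligned_part_distance(a, b, shift=1):
    """Distance between two sets of part features of shape (P, D).

    For each part p of `a`, compare against parts p-shift..p+shift of `b`
    and keep the minimum distance, tolerating small vertical misalignments.
    """
    P = a.size(0)
    dists = []
    for p in range(P):
        lo, hi = max(0, p - shift), min(P, p + shift + 1)
        # Euclidean distances from part p of `a` to neighboring parts of `b`
        d = torch.cdist(a[p:p + 1], b[lo:hi]).min()
        dists.append(d)
    return torch.stack(dists).mean()

def local_alignment_triplet_loss(anchor, positive, negative,
                                 margin=0.3, shift=1):
    """Margin-based triplet loss on aligned part distances.

    anchor/positive/negative: (P, D) part features; in cross-modal training
    the positive would typically come from the other modality (IR vs. RGB).
    """
    d_ap = aligned_part_distance(anchor, positive, shift)
    d_an = aligned_part_distance(anchor, negative, shift)
    return F.relu(d_ap - d_an + margin)
```

In this sketch, setting `shift=0` would reduce the loss to a strictly part-aligned triplet loss, while larger values trade alignment strictness for robustness to pose variation.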