Abstract

This paper studies cross-modality video matching from the perspective of discrete distribution alignment. Central to this discussion is visible-infrared person re-identification (VI-reID), a capability that helps surveillance systems monitor individuals across diverse lighting conditions. Going beyond traditional image-to-image matching paradigms, a recent study shows that temporal information provides richer cues for encoding pedestrian representations, improving the representation power of deep neural networks. However, this integration further complicates cross-modality matching, because spatial and temporal information must be processed jointly. This paper formulates each video as a discrete distribution and aligns cross-modality video representations by reducing the matching cost between the two distributions. A natural way to align the videos is therefore to reduce the divergence between their distributions. In addition, optimal transport (OT), which generates the optimal matching flows and establishes the relevance between two distributions, is employed to measure the distance between distributions. Nevertheless, we observe that incorporating OT into an advanced VI-reID feature extractor leads to a non-symmetric measurement. To mitigate this, the paper introduces a new metric, symmetric optimal transport (SOT), which reformulates OT into a symmetric form. Thorough analyses and empirical studies affirm the superiority of the proposed SOT, which significantly outperforms current state-of-the-art methods in standard benchmark evaluations.
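To make the idea of treating a video as a discrete distribution more concrete, the following is a minimal sketch, not the paper's method. It assumes each video is a list of per-frame feature vectors with uniform weights, approximates the OT cost with entropic regularization (Sinkhorn iterations), and symmetrizes a directed measure by averaging the two directions. The abstract does not give the SOT formula, so the averaging step, the asymmetric ground cost `asym_cost`, and all function names and parameters here are hypothetical and chosen only for illustration.

```python
import math


def sinkhorn_cost(xs, ys, cost, eps=0.5, iters=200):
    """Entropic-regularized OT cost between two uniformly weighted point sets.

    xs, ys: lists of feature vectors (e.g. one vector per video frame).
    cost:   ground cost function cost(x, y); it may be asymmetric.
    """
    n, m = len(xs), len(ys)
    a = [1.0 / n] * n  # uniform mass on the frames of the first video
    b = [1.0 / m] * m  # uniform mass on the frames of the second video
    C = [[cost(x, y) for y in ys] for x in xs]
    K = [[math.exp(-c / eps) for c in row] for row in C]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P[i][j] = u[i] * K[i][j] * v[j]; return <P, C>.
    return sum(u[i] * K[i][j] * v[j] * C[i][j]
               for i in range(n) for j in range(m))


def asym_cost(x, y):
    """Hypothetical asymmetric ground cost (an assumption, not from the paper)."""
    over = sum(max(p - q, 0.0) ** 2 for p, q in zip(x, y))
    under = sum(max(q - p, 0.0) ** 2 for p, q in zip(x, y))
    return over + 0.5 * under


def symmetric_ot(xs, ys, cost):
    """Average the two directed costs so that swapping the inputs gives the same value."""
    return 0.5 * (sinkhorn_cost(xs, ys, cost) + sinkhorn_cost(ys, xs, cost))


if __name__ == "__main__":
    visible = [[0.0, 0.0], [1.0, 0.0]]   # toy "frame features" of one video
    infrared = [[2.0, 1.0], [3.0, 1.0]]  # toy "frame features" of another
    print("directed  V->I:", sinkhorn_cost(visible, infrared, asym_cost))
    print("directed  I->V:", sinkhorn_cost(infrared, visible, asym_cost))
    print("symmetric     :", symmetric_ot(visible, infrared, asym_cost))
```

With an asymmetric ground cost the two directed values differ, while the averaged value is the same regardless of argument order. How the paper's SOT actually achieves symmetry may differ from this averaging.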
