Abstract

One of the key tasks for an intelligent visual surveillance system is to automatically re-identify objects of interest, e.g., persons or vehicles, across nonoverlapping camera views. This demand has driven extensive research on person re-identification (re-ID) and vehicle re-ID techniques, especially deep learning-based ones. While most recent algorithms focus on designing new convolutional neural networks, less attention has been paid to the loss functions, which play a vital role as well. Triplet loss and softmax loss are the two most extensively used losses, but both have limitations. Triplet loss optimizes the model to produce features under which samples from the same class are more similar than samples from different classes. Its problem is that the number of triplets that can be constructed grows cubically with the number of training samples, which causes scalability issues, unstable performance, and slow convergence. Softmax loss scales well and is widely used for large-scale classification problems. However, because softmax loss only aims to separate the training classes well, its performance on re-ID tasks is unsatisfactory: at test time the model must measure the similarity of samples from unseen classes. We propose the support neighbor (SN) loss, which avoids the limitations of both losses. Unlike triplet loss, which is calculated from triplets, SN loss is derived from the K-nearest neighbors, i.e., the support neighbors (SNs), of anchor samples. The SNs of an anchor are unique and carry more valuable contextual information about the anchor and its neighborhood structure, and thus contribute to more stable performance and a more reliable embedding from image space to feature space. Based on the SNs, a softmax-like separation term and a squeeze term are proposed, which encourage interclass separation and intraclass compactness, respectively. Experiments show that SN loss surpasses triplet and softmax losses with the same backbone network and reaches state-of-the-art performance for both person and vehicle re-ID using a ResNet50 backbone when combined with training tricks.
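
The abstract does not state the formulas, so the following is only an illustrative sketch of the two ideas it describes, not the paper's definition. `count_triplets` shows how quickly the number of triplets grows, and `sn_style_loss` shows one plausible way a loss could be built from the K-nearest neighbors of each anchor. The function names, the exponential similarity, the max-minus-min form of the squeeze term, and the parameters `k` and `weight` are all assumptions made for illustration.

```python
import numpy as np


def count_triplets(labels):
    """Count (anchor, positive, negative) triplets with anchor != positive."""
    labels = np.asarray(labels)
    _, counts = np.unique(labels, return_counts=True)
    n = labels.size
    # An anchor in a class of size c has (c - 1) positives and (n - c) negatives.
    return int(np.sum(counts * (counts - 1) * (n - counts)))


def sn_style_loss(features, labels, k=4, weight=1.0):
    """Hypothetical neighbor-based loss (assumed form, not the paper's).

    For each anchor, take its k nearest neighbors in feature space and compute
      separation: -log(sum of exp(-d) over same-class neighbors
                       / sum of exp(-d) over all k neighbors)
      squeeze:    largest minus smallest distance among same-class neighbors
    Anchors with no same-class sample among their neighbors are skipped.
    Assumes k <= number of samples - 1.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    # Pairwise Euclidean distances; the diagonal is set to infinity so that
    # an anchor is never selected as its own neighbor.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)

    losses = []
    for a in range(len(features)):
        neighbors = np.argsort(dist[a])[:k]
        d = dist[a, neighbors]
        is_pos = labels[neighbors] == labels[a]
        if not is_pos.any():
            continue
        sim = np.exp(-d)
        separation = -np.log(sim[is_pos].sum() / sim.sum())
        squeeze = d[is_pos].max() - d[is_pos].min()
        losses.append(separation + weight * squeeze)
    return float(np.mean(losses)) if losses else 0.0


if __name__ == "__main__":
    feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    print(count_triplets([0, 0, 1, 1]))            # 8
    print(sn_style_loss(feats, [0, 0, 1, 1], k=2))  # clusters match labels
    print(sn_style_loss(feats, [0, 1, 0, 1], k=2))  # clusters mix labels
```

In this sketch each anchor contributes a single term computed from its k neighbors, so the work per batch is governed by the number of anchors and k instead of the number of possible triplets. When the feature clusters agree with the labels, the value is smaller than when the nearest neighbor of each sample belongs to another class.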
