Abstract

Multi-object tracking aims to estimate the complete trajectories of objects in a scene. Distinguishing among objects efficiently and correctly in complex environments is a challenging problem. In this paper, a Siamese network with an auto-encoding constraint is proposed to extract discriminative features from detection responses in a tracking-by-detection framework. Unlike recent deep learning methods, the simple two-layer stacked auto-encoder structure enables the Siamese network to operate efficiently with only small-scale online sample data. The auto-encoding constraint reduces the risk of overfitting during small-scale sample training. The proposed Siamese network is then extended to extract the previous-appearance-next vector from a tracklet for better association. This new feature integrates the appearance of an element in a tracklet with its previous-stage and next-stage motion. With the new features, an online, incrementally learned tracking framework is established. It comprises reliable tracklet generation, data association to generate complete object trajectories, and tracklet growth to handle missing detections and to strengthen the new tracklet feature. Thanks to the discriminative features, the final object trajectories can be obtained with an efficient iterative greedy algorithm. Feature experiments show that the proposed Siamese network has advantages in both discrimination and correctness. System experiments show the improved tracking performance of the proposed method.
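
The abstract does not spell out the iterative greedy algorithm, so the following is only a minimal sketch of one common greedy association scheme, not the authors' procedure. It assumes a precomputed similarity matrix in which entry (i, j) scores how well tracklet i continues into tracklet j; the function name `greedy_associate` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def greedy_associate(similarity, threshold=0.5):
    """Greedily link tracklet pairs in order of decreasing similarity.

    similarity[i][j] is an assumed score for linking the tail of tracklet i
    to the head of tracklet j. Each row and each column is used at most once.
    """
    sim = np.array(similarity, dtype=float)  # np.array copies the input
    links = []
    if sim.size == 0:
        return links
    while True:
        # Pick the best remaining pair.
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        if sim[i, j] < threshold:
            break  # nothing left that is confident enough to link
        links.append((int(i), int(j)))
        # Remove both tracklets from further consideration.
        sim[i, :] = -np.inf
        sim[:, j] = -np.inf
    return links


example = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
    [0.1, 0.4, 0.2],
]
links = greedy_associate(example, threshold=0.5)
```

In this example the pair (0, 0) is linked first, then (1, 1); the remaining score of 0.2 falls below the threshold, so the third tracklet stays unlinked. A greedy pass like this is cheaper than a globally optimal assignment, which is only reasonable when the features separate objects well.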

Highlights

  • As a key technology in computer vision, multi-object tracking (MOT) has received growing attention from researchers worldwide.

  • Each detection response d_i^t is associated with its own Siamese network with an auto-encoding constraint, SNAC(d_i^t), which extracts discriminative features to better distinguish d_i^t from the other detections belonging to D^{t+1}.

  • Following the order of the system framework, SNAC was first evaluated on detection responses and then tested on tracklets.

Summary

Introduction

As a key technology in computer vision, multi-object tracking (MOT) has received growing attention from researchers worldwide. A Siamese network with an auto-encoding constraint (SNAC), built on a simple structure, is proposed to extract discriminative features efficiently for the objects in a scene. Inspired by stacked auto-encoder methods [24,25], the output of the encoder layer is trained to represent the input detection response as accurately as possible. This is done by adding a constraint term to the loss function, called the auto-encoding constraint, which effectively prevents the network from overfitting when training with limited samples. One SNAC is trained online for each detection response, and reliable tracklets are generated mainly from the extracted features. A tracking framework is established that includes reliable tracklet generation by incremental learning with SNAC for the detection responses, tracklet growth to improve the previous-appearance-next (PAN) feature and to deal with missing detections, and tracklet association with PAN to generate complete trajectories.
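
To make the idea of a constraint term concrete, the following is a minimal NumPy sketch of a Siamese pair loss with an added reconstruction penalty. It is an illustration under stated assumptions, not the paper's loss: the contrastive form, the `margin`, the weight `lam`, the sigmoid activations, the decoder shape, and all function and parameter names are hypothetical. Only the two-layer encoder and the presence of an auto-encoding term come from the text.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(d_in, d_hid, d_feat, seed=0):
    """Random weights for a two-layer encoder and a mirrored decoder (assumed layout)."""
    rng = np.random.default_rng(seed)
    shapes = {"W1": (d_in, d_hid), "W2": (d_hid, d_feat),
              "V1": (d_feat, d_hid), "V2": (d_hid, d_in)}
    params = {k: rng.normal(0.0, 0.1, s) for k, s in shapes.items()}
    params.update({"b1": np.zeros(d_hid), "b2": np.zeros(d_feat),
                   "c1": np.zeros(d_hid), "c2": np.zeros(d_in)})
    return params


def encode(x, p):
    # Two stacked encoder layers; both Siamese branches share these weights.
    h = sigmoid(x @ p["W1"] + p["b1"])
    return sigmoid(h @ p["W2"] + p["b2"])


def decode(z, p):
    # Decoder used only to measure how well the feature reconstructs the input.
    h = sigmoid(z @ p["V1"] + p["c1"])
    return sigmoid(h @ p["V2"] + p["c2"])


def pair_loss(x_a, x_b, same, p, margin=1.0, lam=0.5):
    """Contrastive pair loss plus lam times the reconstruction error.

    same[k] is 1 if pair k shows the same object and 0 otherwise.
    """
    z_a, z_b = encode(x_a, p), encode(x_b, p)
    dist = np.linalg.norm(z_a - z_b, axis=1)
    # Pull matching pairs together, push non-matching pairs beyond the margin.
    contrastive = same * dist**2 + (1 - same) * np.maximum(0.0, margin - dist)**2
    # Auto-encoding term: features should still reconstruct their inputs.
    recon = (np.sum((decode(z_a, p) - x_a)**2, axis=1)
             + np.sum((decode(z_b, p) - x_b)**2, axis=1))
    return float(np.mean(contrastive + lam * recon))


params = init_params(d_in=16, d_hid=8, d_feat=4)
x = np.random.default_rng(1).random((5, 16))
loss_without_constraint = pair_loss(x, x, np.ones(5), params, lam=0.0)
loss_with_constraint = pair_loss(x, x, np.ones(5), params, lam=0.5)
```

For identical matching pairs the contrastive part vanishes, so any remaining loss comes from the reconstruction term alone. That term keeps the features tied to the input even when very few pairs are available, which is the intuition behind using it against overfitting.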

Related Works
Online Learned Siamese Network with Auto-Encoding Constraint
The Structure of SNAC
Loss Function and Auto-Encoding Constraint
Denoising through the Collection of Training Samples
Iterative Tracklet Generation with SNAC by Incremental Learning
Previous-Appearance-Next Feature of the Tracklet
Tracklet Growing
Tracklet Association in Sliding Windows
Evaluation of SNAC
SNAC for Detection Responses
Methods
SNAC for Tracklets
Evaluation of the MOT System
Method
Conclusions