Abstract

Vision-based multi-object tracking has many potential applications in intelligent transportation systems and intelligent vehicles. Tracking by detection, a popular approach to multi-object tracking, first obtains detection responses from a video sequence and then associates them into a track for each object. Existing tracking-by-detection methods work well in constrained scenarios. However, in complicated scenarios with occlusion and adverse illumination conditions, detection quality deteriorates, which makes it difficult to track objects accurately. In this paper, we present a robust tracker that represents object appearance using stable temporal features and associates the detection responses through a two-step association process. We propose to use a Bi-LSTM (Bidirectional Long Short-Term Memory) network to model object appearance and obtain reliable temporal features. We then estimate the affinity between tracks and detections from multiple cues, including appearance, motion, and shape, and integrate this affinity into the two-step association procedure. We evaluate our method on MOT datasets, and the experimental results are promising compared with state-of-the-art methods.
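
The multi-cue affinity and two-step association described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the affinity formula, the matching algorithm, or any thresholds, so everything below is an assumption. In particular, the product form of the affinity, the Gaussian motion term with `sigma`, the greedy matcher, and the `strict` and `relaxed` thresholds are hypothetical choices, and the appearance features (which the paper obtains from a Bi-LSTM) are taken here as precomputed vectors.

```python
import math

# Boxes are (cx, cy, w, h); "feat" is a precomputed appearance vector
# (assumed to come from some temporal appearance model, e.g. a Bi-LSTM).

def appearance_affinity(f_track, f_det):
    """Cosine similarity between feature vectors, clipped to [0, 1]."""
    dot = sum(a * b for a, b in zip(f_track, f_det))
    norm = math.sqrt(sum(a * a for a in f_track)) * math.sqrt(sum(b * b for b in f_det))
    return max(0.0, dot / norm) if norm > 0 else 0.0

def motion_affinity(box_t, box_d, sigma=50.0):
    """Gaussian of the centre distance; sigma (pixels) is an assumed scale."""
    d2 = (box_t[0] - box_d[0]) ** 2 + (box_t[1] - box_d[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))

def shape_affinity(box_t, box_d):
    """Penalise relative differences in width and height."""
    wt, ht, wd, hd = box_t[2], box_t[3], box_d[2], box_d[3]
    return math.exp(-(abs(wt - wd) / (wt + wd) + abs(ht - hd) / (ht + hd)))

def affinity(track, det):
    """Assumed combination rule: product of the three cue affinities."""
    return (appearance_affinity(track["feat"], det["feat"])
            * motion_affinity(track["box"], det["box"])
            * shape_affinity(track["box"], det["box"]))

def greedy_match(tracks, dets, track_ids, det_ids, threshold):
    """Greedy one-to-one matching in descending affinity order.

    A stand-in for whatever assignment method the paper uses.
    """
    pairs = sorted(
        ((affinity(tracks[t], dets[d]), t, d) for t in track_ids for d in det_ids),
        reverse=True,
    )
    matches, used_t, used_d = [], set(), set()
    for score, t, d in pairs:
        if score < threshold:
            break  # remaining pairs score even lower
        if t in used_t or d in used_d:
            continue
        matches.append((t, d))
        used_t.add(t)
        used_d.add(d)
    return matches

def two_step_associate(tracks, dets, strict=0.6, relaxed=0.2):
    """Step 1 keeps only confident pairs; step 2 retries the leftovers."""
    all_t, all_d = list(range(len(tracks))), list(range(len(dets)))
    first = greedy_match(tracks, dets, all_t, all_d, strict)
    matched_t = {t for t, _ in first}
    matched_d = {d for _, d in first}
    rest_t = [t for t in all_t if t not in matched_t]
    rest_d = [d for d in all_d if d not in matched_d]
    second = greedy_match(tracks, dets, rest_t, rest_d, relaxed)
    return first, second

# Toy data: track 1's detection has a degraded appearance feature,
# as might happen under partial occlusion.
tracks = [
    {"feat": [1.0, 0.0], "box": (100.0, 100.0, 40.0, 80.0)},
    {"feat": [0.0, 1.0], "box": (300.0, 200.0, 50.0, 100.0)},
]
dets = [
    {"feat": [1.0, 0.0], "box": (102.0, 101.0, 40.0, 80.0)},
    {"feat": [0.8, 0.6], "box": (330.0, 200.0, 50.0, 100.0)},
]
first, second = two_step_associate(tracks, dets)
```

In this toy setup, the cleanly observed object is matched in the strict first pass, while the object with the degraded appearance feature falls below the strict threshold and is only recovered in the relaxed second pass. This is one plausible reading of why a two-step procedure helps when detections are unreliable.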
