Abstract

We propose a novel real-time multiple pedestrian tracker for videos acquired from both static and moving cameras in unconstrained real-world environment. In such scenes, trackers always suffer from noisy detections and frequent occlusions. Existing methods usually use complex learning approaches and a large number of training samples to get discriminative appearance features. However, this leads to high computational cost and hardly works in occlusions (missing detections) and undistinguishable appearance. Addressing this, we design a light two-stage tracker. Firstly, a shallow net with two layers of full convolution is proposed to encode appearance. Compared with other deep architectures and sophisticated learning approaches, our shallow net is efficient and robust enough without any online updating. Secondly, we design a motion model to deal with noisy detections and missing objects caused by motion blur or occlusion. By mining the motion pattern, our tracker can reliably predict the object location under challenging scenes. Furthermore, we propose a speedup version to verify our robustness and the possibility of using in online applications. Extensive experiments are implemented on multiple object tracking benchmarks, MOT15 and MOT17. The performance is competitive over a number of state-of-the-art trackers and demonstrates that our tracker is very promising for real-time applications.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call