Abstract

The objective of multiple object tracking (MOT) is to assign a unique track identity to every object of interest in a video, across the whole sequence. Tracking-by-detection is the most common approach to the MOT problem. In this work, we propose a method that addresses MOT by defining a dissimilarity measure based on object motion, appearance, structure, and size. We calculate the appearance- and structure-based dissimilarity measures by matching histograms following a grid architecture. The motion and size of each track are predicted from the track's history. These dissimilarity values are then used by the Hungarian algorithm in the data association step to assign track identities. In addition, we introduce a method to handle false detections in stable tracks. The proposed method is online and runs in real time. We evaluate our method on both the MOT17 benchmark dataset for pedestrian tracking and the KITTI benchmark dataset for vehicle tracking, using the same system parameters for both to verify the robustness of the proposed method. The method achieves state-of-the-art results on both benchmarks.
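
As a rough illustration of the data association step, the Python sketch below builds a dissimilarity matrix between predicted track states and new detections and solves the assignment with the Hungarian algorithm. It is a minimal sketch under stated assumptions, not the formulation used in this work: the motion term (1 - IoU), the size term (relative area difference), the appearance term (1 - histogram intersection averaged over grid cells), the weights, and the gate threshold are all illustrative choices, the structure term is omitted, and the names `hungarian`, `dissimilarity`, and `associate` are hypothetical.

```python
def hungarian(cost):
    """Minimum-cost assignment for a matrix with rows <= cols.
    Returns match, where match[i] is the column assigned to row i."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials (1-based)
    v = [0.0] * (m + 1)          # column potentials (1-based)
    p = [0] * (m + 1)            # p[j] = row matched to column j, 0 if free
    way = [0] * (m + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    return match


def area(box):
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    union = area(a) + area(b) - iw * ih
    return iw * ih / union if union > 0 else 0.0


def grid_hist_distance(cells_a, cells_b):
    """Mean of (1 - histogram intersection) over matching grid cells.
    Each cell histogram is assumed to be normalized to sum to 1."""
    dists = [1.0 - sum(min(x, y) for x, y in zip(ha, hb))
             for ha, hb in zip(cells_a, cells_b)]
    return sum(dists) / len(dists)


def dissimilarity(track, det, w_motion=0.5, w_size=0.2, w_app=0.3):
    """Weighted sum of assumed motion, size, and appearance terms."""
    motion = 1.0 - iou(track["box"], det["box"])
    ta, da = area(track["box"]), area(det["box"])
    size = abs(ta - da) / max(ta, da) if max(ta, da) > 0 else 1.0
    appearance = grid_hist_distance(track["cells"], det["cells"])
    return w_motion * motion + w_size * size + w_app * appearance


def associate(tracks, dets, gate=0.7):
    """Return sorted (track_index, detection_index) pairs under the gate."""
    if not tracks or not dets:
        return []
    cost = [[dissimilarity(t, d) for d in dets] for t in tracks]
    if len(tracks) <= len(dets):
        pairs = list(enumerate(hungarian(cost)))
    else:                        # more tracks than detections: transpose
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(t, d) for d, t in enumerate(hungarian(transposed))]
    return sorted((t, d) for t, d in pairs if cost[t][d] <= gate)
```

Here each track is a dictionary holding its predicted box under `"box"` and its per-cell histograms under `"cells"`, and each detection uses the same layout. Pairs whose dissimilarity exceeds `gate` are left unmatched, so in a full tracker they could start new tracks or mark existing tracks as missed.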

Highlights

  • Tracking is a challenging problem in many video analysis tasks, where an object is to be identified and assigned a unique identity over all the frames of an image sequence in which it appears

  • The MOT17 training set has a total of 15,948 frames with an average of 21.1 detections per frame, while the test set contains a total of 17,757 frames with an average of 31.8 detections per frame

  • The evaluation code provided by the MOT17 and KITTI benchmarks was used; it is based on the CLEAR MOT metrics [38] and on the Mostly Tracked (MT: % of ground-truth trajectories covered by the tracker output for more than 80% of their length) and Mostly Lost (ML: % of ground-truth trajectories covered by the tracker output for less than 20% of their length) metrics [39]
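
To make the MT and ML definitions concrete, the following Python sketch classifies ground-truth trajectories by the fraction of their length covered by the tracker output. It is only an illustration of the thresholds stated above, not the benchmark evaluation code; the function names and the `(covered_frames, total_frames)` input format are assumptions, and the label `"PT"` (partially tracked) for everything in between is used here for convenience.

```python
def classify_trajectory(covered_frames, total_frames):
    """Label one ground-truth trajectory by its covered fraction."""
    ratio = covered_frames / total_frames
    if ratio > 0.8:
        return "MT"   # mostly tracked: covered for more than 80% of its length
    if ratio < 0.2:
        return "ML"   # mostly lost: covered for less than 20% of its length
    return "PT"       # everything in between


def mt_ml_percent(trajectories):
    """Return (MT %, ML %) over (covered_frames, total_frames) pairs."""
    labels = [classify_trajectory(c, t) for c, t in trajectories]
    n = len(labels)
    return (100.0 * labels.count("MT") / n,
            100.0 * labels.count("ML") / n)
```

For example, four trajectories covered for 9 of 10, 1 of 10, 5 of 10, and 10 of 10 frames give two MT, one ML, and one in-between trajectory.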


Summary

Introduction

Tracking is a challenging problem in many video analysis tasks, where an object (defined by a bounding box) is to be identified and assigned a unique identity over all the frames of an image sequence in which it appears. Tracking research can broadly be divided into multiple object tracking (MOT) and single object tracking. While MOT assumes object detections as prior knowledge, single object tracking tries to localize and track an unknown object that is described only by its location in the first frame. The most prominent techniques in single object tracking are discriminative rather than generative. Generative trackers build an object appearance model without considering the background [1], whereas discriminative trackers learn a classifier that distinguishes the target from negative samples, which is more accurate [2]. The TLD tracker [3] divides the tracking process into three stages

