Abstract
In this work, we address the problem of multi-vehicle detection and tracking for traffic monitoring applications. We present a novel intelligent visual sensor for tracking-by-detection with simultaneous pose estimation. Essentially, we adapt an Extended Kalman Filter (EKF) to work not only with the detections of the vehicles but also with their estimated coarse viewpoints, obtained directly from the vision sensor. We show that enhancing the tracking with observations of the vehicle pose results in a better estimation of the vehicles' trajectories. For the simultaneous object detection and viewpoint estimation task, we present and evaluate two independent solutions. The first is based on a fast GPU implementation of a Histogram of Oriented Gradients (HOG) detector with Support Vector Machines (SVMs). For the second, we modify and train the Faster R-CNN deep learning model so that it recovers not only the object localization but also an estimate of its pose. Finally, we publicly release a challenging dataset, the GRAM Road Traffic Monitoring (GRAM-RTM) dataset, which has been specifically designed for evaluating multi-vehicle tracking approaches within the context of traffic monitoring applications. It comprises more than 700 unique vehicles annotated across more than 40,300 frames of three videos. We expect GRAM-RTM to become a benchmark in vehicle detection and tracking, providing the computer vision and intelligent transportation systems communities with a standard set of images, annotations and evaluation procedures for multi-vehicle tracking. We present a thorough experimental evaluation of our approaches on GRAM-RTM, which will be useful for establishing further comparisons. The results obtained confirm that integrating vehicle localizations and pose estimations simultaneously as observations in an EKF improves the tracking results.
Highlights
Many intelligent transportation systems need a robust and fast sensor for detecting and tracking multiple vehicles
A preliminary version of this work was published in Reference [28]. For this journal paper: (a) we have extended the technical and theoretical analysis of the multi-vehicle tracking solution; (b) we incorporate a novel model for the object detection and pose estimation, that is, the one based on the Faster R-CNN; (c) we have made a significant extension of the experimental validation; and (d) we detail and publicly release a revised version of the GRAM-RTM dataset
We report the results of a simple Extended Kalman Filter (EKF), with the same dynamic model but where the pose of the object is not recovered through the detector
Summary
Many intelligent transportation systems need a robust and fast sensor for detecting and tracking multiple vehicles. In this work we introduce a new intelligent vision-based sensor able to perform multi-vehicle tracking through joint object detection and coarse viewpoint estimation. In a multi-vehicle tracking-by-detection approach, a fundamental part of the system pipeline is the object detection step. Beyond the detections themselves, the tracking model can also use observations of the viewpoints of the vehicles, that is, the pose of each vehicle with respect to the camera. Can we recover this information jointly during the detection step in a fast way? How can we efficiently integrate these pose observations into the tracking model? These are some of the questions we want to answer with this work (Sensors 2019, 19, 4062).
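To make the idea of feeding pose observations into the tracker more concrete, the following is a minimal sketch of one EKF predict/update cycle in which the measurement contains both a detection position and a coarse viewpoint. It is not the authors' implementation. The state layout `(x, y, v, theta)`, the constant-velocity heading-based motion model, the 8-bin viewpoint discretization in `bin_to_angle`, the direct use of the viewpoint angle as a heading measurement, and all noise values are assumptions chosen for illustration only.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b, sign=1.0):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse(m):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def wrap(angle):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def bin_to_angle(viewpoint_bin, n_bins=8):
    """Hypothetical mapping from a coarse viewpoint class to an angle."""
    return wrap(2.0 * math.pi * viewpoint_bin / n_bins)

# Assumed observation matrix: we observe x, y and the heading theta.
H = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

def predict(x, P, Q, dt):
    """Nonlinear motion model (assumed) and its Jacobian F."""
    px, py, v, th = x
    x_pred = [px + v * math.cos(th) * dt, py + v * math.sin(th) * dt, v, th]
    F = [[1, 0, math.cos(th) * dt, -v * math.sin(th) * dt],
         [0, 1, math.sin(th) * dt,  v * math.cos(th) * dt],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]
    P_pred = add(matmul(matmul(F, P), transpose(F)), Q)
    return x_pred, P_pred

def update(x, P, z, R):
    """Update with z = (x_det, y_det, viewpoint_angle)."""
    # The angular residual is wrapped so that bins near +/-pi behave correctly.
    innov = [z[0] - x[0], z[1] - x[1], wrap(z[2] - x[3])]
    S = add(matmul(matmul(H, P), transpose(H)), R)
    K = matmul(matmul(P, transpose(H)), inverse(S))
    x_new = [xi + sum(K[i][j] * innov[j] for j in range(3))
             for i, xi in enumerate(x)]
    x_new[3] = wrap(x_new[3])
    P_new = add(P, matmul(matmul(K, H), P), sign=-1.0)
    return x_new, P_new

def identity(n, scale=1.0):
    return [[scale * float(i == j) for j in range(n)] for i in range(n)]

# One cycle: a vehicle heading along +x receives a detection whose
# viewpoint class suggests it is turning (bin 1 of 8, i.e. pi/4).
x0, P0 = [0.0, 0.0, 1.0, 0.0], identity(4)
x_pred, P_pred = predict(x0, P0, identity(4, 0.01), dt=1.0)
R = [[1.0, 0, 0], [0, 1.0, 0], [0, 0, 0.5]]
x_upd, P_upd = update(x_pred, P_pred, (1.0, 0.0, bin_to_angle(1)), R)
```

Dropping the third row of `H` (and the matching entries of `z` and `R`) yields a position-only filter with the same dynamic model, which corresponds in spirit to a baseline where the pose is not observed through the detector.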