Abstract

Automatically counting vehicles in complex traffic scenes from videos is challenging. Detection and tracking algorithms may fail due to occlusions, scene clutter, and large variations in viewpoint and vehicle type. We propose a new approach that counts vehicles by exploiting contextual regularities in scene structure. It breaks the problem into simpler subproblems, each of which counts the vehicles on a single path. The model of each path, together with its source and sink, imposes strong regularization on the motion and sizes of vehicles and can thus significantly improve the accuracy of vehicle counting. Our approach is based on tracking and clustering feature points and has three parts. First, an algorithm is proposed to automatically learn the models of scene structures. A traffic scene is segmented into local semantic regions by exploiting the temporal co-occurrence of local motions. Local semantic regions are connected into complete global paths using the proposed fast marching algorithm. Sources and sinks are estimated from the models of semantic regions. Second, an algorithm is proposed to cluster trajectories of feature points into objects and to estimate average vehicle sizes at different locations from the initial clustering results. Third, because trajectories of feature points are often fragmented due to occlusions, the spatiotemporal features of trajectory clusters are integrated with the contextual models of paths, sources, and sinks, so that trajectory clusters are assigned to paths and connected into complete trajectories. Experimental results on a complex traffic scene show the effectiveness of our approach.
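The idea of counting vehicles on each path separately can be illustrated with a minimal sketch. It is not the method of the paper, which combines spatiotemporal features of trajectory clusters with learned path models. Here each already completed trajectory is simply assigned to the path whose source and sink lie closest to its endpoints. The names `Path`, `assign_path`, and `count_per_path`, as well as the coordinates, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Path:
    """Hypothetical path model reduced to a source and a sink point."""
    name: str
    source: tuple  # (x, y) where vehicles enter the scene
    sink: tuple    # (x, y) where vehicles leave the scene


def assign_path(trajectory, paths):
    """Return the name of the path whose source and sink best match
    the first and last points of the trajectory."""
    start, end = trajectory[0], trajectory[-1]

    def cost(path):
        # Sum of the start-to-source and end-to-sink Euclidean distances.
        return (hypot(start[0] - path.source[0], start[1] - path.source[1])
                + hypot(end[0] - path.sink[0], end[1] - path.sink[1]))

    return min(paths, key=cost).name


def count_per_path(trajectories, paths):
    """Count trajectories per path; one complete trajectory is one vehicle."""
    return Counter(assign_path(t, paths) for t in trajectories)


# Made-up example data: two paths crossing a 100 x 100 image.
paths = [
    Path("north_to_south", source=(50, 0), sink=(50, 100)),
    Path("west_to_east", source=(0, 50), sink=(100, 50)),
]
trajectories = [
    [(48, 2), (50, 50), (52, 97)],
    [(3, 51), (50, 49), (98, 50)],
    [(51, 5), (49, 60), (50, 95)],
]
counts = count_per_path(trajectories, paths)
print(dict(counts))
```

In this simplified setting the total count is just the sum over paths. The sketch assumes fragmented trajectories have already been connected, which is the part the abstract describes as relying on the contextual models.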
