Abstract

In this paper, we describe a framework for video analysis and a method to detect and understand the class of events we refer to as split and merge events, from single or multiple video streams. We start with automatic detection of scene changes, including scene cuts and camera operations such as zoom, pan and tilt. For each new scene, camera calibration is performed and the scene geometry is estimated in order to determine the absolute position of each detected object. Objects in the video scenes are detected using an adaptive background subtraction method and tracked over consecutive frames. Detection and tracking are designed to identify the key split and merge behaviors, in which one object splits into two or more objects or two or more objects merge into one. We have identified split and merge behaviors as the key behavior components of several higher-level activities, such as package drop-off, exchanges between people, people getting out of cars, and crowd formation. We embed the data about scenes, camera parameters, object features and positions into the video stream as metadata, so that results for several related scenes can be correlated, compared and associated to achieve better video event understanding. Storing this detailed syntactic information in the stream keeps it physically associated with the video itself and guarantees that analysis results are preserved in archival storage and when sub-clips are created for distribution to other users. We present some preliminary results over single and multiple video streams.
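
To make the split and merge notion concrete, the following is a minimal sketch, not the method used in the paper. It assumes that object detection has already produced axis-aligned bounding boxes for two consecutive frames, and it associates boxes purely by overlap; the names `Box`, `overlaps` and `split_merge_events` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[int, int, int, int]


def overlaps(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes share a non-empty area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def split_merge_events(prev: List[Box], curr: List[Box]):
    """Find split and merge candidates between two consecutive frames.

    A split is one previous box overlapping two or more current boxes.
    A merge is two or more previous boxes overlapping one current box.
    Returns (splits, merges), where
      splits is a list of (prev_index, [curr_indices]) and
      merges is a list of ([prev_indices], curr_index).
    """
    splits = []
    for i, p in enumerate(prev):
        children = [j for j, c in enumerate(curr) if overlaps(p, c)]
        if len(children) >= 2:
            splits.append((i, children))

    merges = []
    for j, c in enumerate(curr):
        parents = [i for i, p in enumerate(prev) if overlaps(p, c)]
        if len(parents) >= 2:
            merges.append((parents, j))

    return splits, merges


# One object in the previous frame becomes two in the current frame.
split_case = split_merge_events([(0, 0, 10, 10)], [(0, 0, 4, 10), (6, 0, 10, 10)])

# Two objects in the previous frame become one in the current frame.
merge_case = split_merge_events([(0, 0, 4, 4), (5, 0, 9, 4)], [(0, 0, 9, 4)])
```

A real system would work on tracked objects with positions derived from the calibrated scene geometry rather than raw image-plane overlap, and would need to filter out brief occlusions; this sketch only shows the basic one-to-many and many-to-one association test.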

