Abstract

In this paper, we model multi-agent events in terms of a temporally varying sequence of sub-events, and propose a novel approach for learning, detecting, and representing events in videos. The proposed approach has three main steps. First, in order to learn the event structure from training videos, we automatically encode the sub-event dependency graph, which is the learnt event model that depicts the conditional dependencies between sub-events. Second, we pose the problem of event detection in novel videos as clustering the maximally correlated sub-events using normalized cuts. The principal assumption made in this work is that an event is composed of a highly correlated chain of sub-events that have high weights (association) within the cluster and relatively low weights (disassociation) between clusters. The event detection requires no prior knowledge of the number of agents involved in an event and makes no assumptions about the length of an event. Third, we recognize that any abstract event model should extend to representations related to human understanding of events. Therefore, we propose an extension of the CASE representation of natural languages that provides a plausible interface between users and the computer. We show results of learning, detection, and representation of events for videos in the meeting, surveillance, and railroad monitoring domains.
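The normalized-cut clustering step described above can be sketched with a standard spectral relaxation: given a sub-event affinity matrix, the second-smallest generalized eigenvector of the graph Laplacian yields a bipartition that keeps high-association sub-events together. The affinity values and function name below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Bipartition sub-events via the normalized-cut criterion.

    W: symmetric affinity matrix; W[i, j] is the association
    (correlation) between sub-events i and j.
    Returns a boolean cluster label per sub-event.
    """
    d = W.sum(axis=1)                       # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    # Symmetric normalized Laplacian: D^{-1/2} (D - W) D^{-1/2}
    L = D_inv_sqrt @ (np.diag(d) - W) @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L)          # ascending eigenvalues
    # Second-smallest eigenvector (relaxed cut indicator),
    # mapped back through D^{-1/2}; split by sign.
    fiedler = D_inv_sqrt @ vecs[:, 1]
    return fiedler > 0

# Toy example: two loosely coupled groups of sub-events
# (strong within-group association, weak between-group association)
W = np.array([
    [0.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 0.0, 0.7, 0.0, 0.1],
    [0.8, 0.7, 0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1, 0.0, 0.9],
    [0.0, 0.1, 0.0, 0.9, 0.0],
])
labels = normalized_cut_bipartition(W)
```

In this sketch, sub-events 0-2 and 3-4 land in separate clusters, mirroring the paper's assumption of high intra-cluster association and low inter-cluster association; recursive application of the same bipartition would handle more than two events.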
