Knowledge-based detection of events in video streams from salient regions of activity

Nicolas Moënne-Loccoz, Eric Bruno, Stéphane Marchand-Maillet

doi:10.1007/s10044-004-0235-0

Abstract

Visual events occurring in video streams (such as human postures or more complex activities) are detected from a robust and generic region-based representation of the visual content and inferred using a spatio-temporal language that integrates domain-specific knowledge. More specifically, salient regions of activity are first extracted from the dynamics of salient points across the scene. These regions are mapped to a domain vocabulary using a state-of-the-art classifier, so that the visual content is described in terms of semantic facts. Occurrences of events, modelled as assertions in a language representing spatio-temporal relationships between facts, are then inferred from the video descriptions by applying a forward-reasoning engine. An application to visual event retrieval in videos of meetings is presented as a test case.
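
The inference step can be illustrated with a minimal sketch of forward reasoning over time-stamped facts. It is not the authors' spatio-temporal language or engine: the fact labels, the single "shortly before" temporal relation, the `max_gap` threshold, and the two-premise rule shape are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Fact:
    """A semantic fact holding over a frame interval (labels are hypothetical)."""
    label: str
    start: int  # first frame where the fact holds
    end: int    # last frame where the fact holds (inclusive)


@dataclass(frozen=True)
class Rule:
    """If `first` is shortly followed by `second`, assert `event` (assumed rule shape)."""
    first: str
    second: str
    event: str


def shortly_before(a: Fact, b: Fact, max_gap: int = 5) -> bool:
    """True when b starts between 0 and max_gap frames after a ends."""
    return 0 <= b.start - a.end <= max_gap


def forward_chain(facts: Iterable[Fact], rules: Iterable[Rule]) -> Set[Fact]:
    """Apply the rules repeatedly until no new fact can be derived."""
    known = set(facts)
    rules = list(rules)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # Iterate over snapshots so that `known` can grow inside the loop.
            for a in list(known):
                if a.label != rule.first:
                    continue
                for b in list(known):
                    if b.label != rule.second or not shortly_before(a, b):
                        continue
                    derived = Fact(rule.event, a.start, b.end)
                    if derived not in known:
                        known.add(derived)
                        changed = True
    return known


if __name__ == "__main__":
    # Hypothetical classifier output for a meeting video.
    observed = [
        Fact("sitting", 0, 40),
        Fact("standing", 42, 90),
        Fact("at_whiteboard", 92, 150),
    ]
    rules = [
        Rule("sitting", "standing", "stands_up"),
        Rule("stands_up", "at_whiteboard", "goes_to_whiteboard"),
    ]
    for fact in sorted(forward_chain(observed, rules), key=lambda f: (f.start, f.end, f.label)):
        print(fact)
```

In this sketch the second rule fires only because the first rule has already added a `stands_up` fact, which is the chaining behaviour that distinguishes forward reasoning from a single pass of pattern matching.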

Full Text