Abstract

The generation of articulated-body motion for computer animation is an expensive and time-consuming task, and recognition of human actions and interactions is important for video annotation, automated surveillance, and content-based video retrieval. This paper presents a new model-based, human-intervention-free approach to articulated-body motion tracking and to recognition of human interactions from static-background monocular video sequences. Two major applications are built on the basic motion tracking: motion capture and human behavior recognition.

To determine a human body configuration in a scene, a 3D human body model is postulated and projected onto a 2D projection plane so that it overlaps the foreground image silhouette. We convert the model-body overlapping problem into a parameter optimization problem to avoid kinematic singularities. Unlike other methods, our body tracking needs no user intervention. A cost function estimates the degree of overlap between the foreground input image silhouette and the projected 3D model-body silhouette; the configuration with the greatest overlap with the image foreground and the least overlap with the background is sought. The overlap is computed with computational geometry by converting a set of pixels in the image domain into a polygon in the 2D projection-plane domain.

We recognize human interaction motion using hierarchical finite state automata (FA). The model motion data obtained from tracking is analyzed into states and events in terms of feet, torso, and hands by a low-level behavior recognition model. The recognition model represents human behaviors as sequences of states that classify the configurations of individual body parts in space and time.
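The overlap criterion described above can be sketched as a simple cost function. This is an illustrative pixel-set version, not the authors' implementation: the paper computes the overlap with computational geometry on polygons in the projection plane, and the function name and representation here are assumptions.

```python
# Hypothetical sketch of the silhouette-overlap cost: reward projected
# model-silhouette pixels that fall on the image foreground, penalize
# those that fall on the background. A parameter optimizer over body-pose
# parameters would minimize this cost, sidestepping inverse-kinematics
# singularities.

def overlap_cost(model_silhouette, foreground):
    """Return a cost to minimize: high when the projected model
    silhouette covers background, low when it covers foreground."""
    model = set(model_silhouette)    # pixels covered by the projected 3D model
    fg = set(foreground)             # pixels of the foreground image silhouette
    on_foreground = len(model & fg)  # overlap with foreground (maximize)
    on_background = len(model - fg)  # overlap with background (minimize)
    return on_background - on_foreground
```

In the paper the pixel sets are converted to polygons first, so the overlap areas are computed geometrically rather than by counting pixels as above.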
To overcome the exponential growth in the number of states that usually occurs in a single-level FA, we present a new hierarchical FA that abstracts states and events from motion data at three levels: the low-level FA analyzes body parts only, the middle-level FAs recognize motions, and the high-level FAs analyze a human interaction. Motion tracking and behavior recognition results on video sequences are very encouraging.

Keywords: Model Body, Motion Tracking, State Automaton, Video Annotation, Parameter Optimization Problem. (These keywords were added by machine and not by the authors; the process is experimental and the keywords may be updated as the learning algorithm improves.)
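The three-level hierarchy can be sketched as follows. This is an illustrative toy, not the authors' automata: the state names, thresholds, and frame fields are all hypothetical, and real middle- and high-level FAs would match state sequences rather than test set membership.

```python
# Toy three-level hierarchy: a low-level classifier maps body-part data
# per frame to a state, a middle-level stage maps state sequences to a
# motion label, and a high-level stage maps two persons' motions to an
# interaction label. Keeping each level small avoids one flat FA whose
# state count grows exponentially with the number of body parts.

def low_level(frame):
    """Classify one frame of body-part measurements into a coarse state."""
    if frame["feet_apart"] and frame["torso_moving"]:
        return "stepping"
    if frame["hand_raised"]:
        return "hand_up"
    return "standing"

def mid_level(states):
    """Recognize a motion from a sequence of low-level states."""
    if "stepping" in states:
        return "walking"
    if "hand_up" in states:
        return "reaching"
    return "idle"

def high_level(motions_a, motions_b):
    """Recognize an interaction from two persons' motion labels."""
    if "walking" in motions_a and "walking" in motions_b:
        return "approaching"
    if "reaching" in motions_a and "reaching" in motions_b:
        return "handshake"
    return "no_interaction"
```

Because each level consumes only the abstracted output of the level below, adding a body part enlarges only the low-level automaton rather than multiplying the states of the whole recognizer.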

