Abstract

Modeling the dynamic shape and appearance of articulated moving objects is essential for human motion analysis, tracking, synthesis, and other computer vision problems. Modeling the shape and appearance of human motion is challenging because of the high dimensionality of articulated human motion, variations in shape and appearance across views and people, and the nonlinearity of shape and appearance deformations in observed sequences. Recent interest in modeling human motion stems from potential real-world applications such as visual surveillance, human-computer interaction, video analysis, and computer animation. We present a novel framework for modeling dynamic shape and appearance using nonlinear manifold embedding and factorization. We investigate different representations for embedding high-dimensional human motion sequences in low-dimensional spaces using supervised and unsupervised manifold learning, seeking representations that capture the intrinsic structure of the motion. Nonlinear dimensionality reduction techniques based on visual and kinematic data are applied to discover low-dimensional intrinsic manifold representations of body configuration. We also investigate supervised manifold learning from a known manifold topology to model the deformation of manifolds from an ideal case. By learning a nonlinear mapping from the embedding space to the input shape or appearance, we can generate shape and appearance sequences according to the motion state on the embedded manifold. We present a decomposable generative model that analyzes shape and appearance variations due to different factors such as personal style, motion type, and viewpoint. We use multilinear analysis in the nonlinear-mapping coefficient space to factorize shape and appearance variations. We also investigate learning generative models that represent continuous body-configuration and continuous view manifolds in a product space (i.e., body configuration manifold × view manifold). The proposed factorized generative models provide rich models for analyzing the dynamic shape and appearance of human motion. We apply the model to computer vision problems such as inferring 3D body pose from 2D images, tracking human motion under continuous view variation within a Bayesian framework, and gait recognition. We also apply our model to facial expression analysis, tracking, recognition, and synthesis.
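The core pipeline the abstract describes — unsupervised embedding of high-dimensional motion data onto a low-dimensional manifold, followed by a learned nonlinear mapping from the embedding back to the input space for generation — can be sketched roughly as follows. This is a minimal illustration, not the paper's exact method: the synthetic "motion" data, the choice of locally linear embedding (LLE), and the RBF kernel-ridge regressor standing in for the nonlinear mapping are all assumptions made for the example.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.kernel_ridge import KernelRidge

# Toy stand-in for a cyclic motion sequence (e.g. a gait cycle):
# a 1-D closed curve embedded in a 100-D observation space.
# This synthetic data is illustrative only.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.c_[np.cos(t), np.sin(t)]
A = rng.normal(size=(2, 100))
X = circle @ A + 0.01 * rng.normal(size=(200, 100))

# Step 1: unsupervised manifold learning recovers a low-dimensional
# intrinsic embedding of the motion sequence.
lle = LocallyLinearEmbedding(n_components=2, n_neighbors=10)
Z = lle.fit_transform(X)
Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)  # normalize embedding scale

# Step 2: learn a nonlinear (RBF) mapping from the embedding space back
# to the input space, so that points on the manifold generate
# shape/appearance vectors.
mapper = KernelRidge(kernel="rbf", gamma=0.5, alpha=1e-3)
mapper.fit(Zs, X)

# Reconstructing the training sequence through the mapping should give a
# small relative error if the embedding captures the intrinsic structure.
X_rec = mapper.predict(Zs)
err = np.mean(np.linalg.norm(X_rec - X, axis=1) / np.linalg.norm(X, axis=1))
print(f"mean relative reconstruction error: {err:.3f}")
```

In the factorized models described above, the coefficients of such a nonlinear mapping (rather than the raw observations) would then be decomposed by multilinear analysis into style, content, and view factors.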
