Abstract
Temporal segmentation of human motion into actions is central to the understanding and building of computational models of human motion and activity recognition. Several issues contribute to the challenge of temporally segmenting and classifying human motion, including the large variability in the temporal scale and periodicity of human actions, the complexity of representing articulated motion, and the exponential number of possible movement combinations. We provide initial results from investigating two distinct problems: classification of the overall task being performed, and the more difficult problem of classifying individual frames over time into specific actions. We explore first-person sensing through a wearable camera and inertial measurement units (IMUs) for temporally segmenting human motion into actions and performing activity classification in the context of cooking and recipe preparation in a natural environment. We present baseline results for supervised and unsupervised temporal segmentation, and for recipe recognition, on the CMU Multimodal Activity database (CMU-MMAC).
Highlights
Temporal segmentation of human motion into actions is central to the understanding and building of computational models of human motion and activity recognition
In this work we explore the use of Inertial Measurement Units (IMUs) and a first-person camera for overall task classification, action segmentation and action classification in the context of cooking and preparing recipes in an unstructured environment
As a first step to exploring this space, we investigate the feasibility of standard supervised and unsupervised Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), and K-Nearest Neighbor (K-NN) techniques for action segmentation and classification on these two modalities
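As a minimal illustration of the per-frame K-NN baseline named above, the sketch below labels individual frames by majority vote over their nearest neighbors in feature space. The feature vectors, action labels, and dimensionality here are hypothetical toy values, not data from the CMU-MMAC database.

```python
# Hedged sketch: per-frame K-NN action classification on toy "IMU features".
# The features, labels, and k are illustrative assumptions, not the paper's setup.
import math
from collections import Counter

def knn_classify(train, labels, frame, k=3):
    """Label one frame by majority vote of its k nearest training frames."""
    dists = sorted((math.dist(frame, x), y) for x, y in zip(train, labels))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D per-frame features (e.g. accelerometer and gyro magnitudes).
train = [(0.1, 0.2), (0.2, 0.1), (0.9, 1.0), (1.0, 0.9), (0.95, 1.1)]
labels = ["stir", "stir", "pour", "pour", "pour"]

print(knn_classify(train, labels, (0.15, 0.15)))  # -> stir
print(knn_classify(train, labels, (1.0, 1.0)))    # -> pour
```

In practice each frame would be a higher-dimensional vector of IMU readings or visual features, and the GMM and HMM baselines would additionally model the temporal structure that this frame-independent classifier ignores.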
Summary
Temporal segmentation of human motion into actions is central to the understanding and building of computational models of human motion and activity recognition. While previous research has shown promising results in recognizing human activities, factorizing human motion into primitives and actions (i.e. temporal segmentation) remains an unsolved problem in human motion analysis. In this work we explore the use of Inertial Measurement Units (IMUs) and a first-person camera for overall task classification, action segmentation and action classification in the context of cooking and preparing recipes in an unstructured environment. This paper provides baseline results for recipe classification, action segmentation and action classification on the Carnegie Mellon University Multimodal Activity (CMU-MMAC) database [6].