Abstract

This paper addresses the problem of human action recognition in realistic videos. We follow recently successful local approaches and represent videos by means of local motion descriptors. To overcome the large variability of human actions in motion and appearance, we propose a supervised approach to learn local motion descriptors - actlets - from a large pool of annotated video data. The main motivation behind our method is to construct action-characteristic representations of body joints undergoing specific motion patterns while learning invariance with respect to changes in camera views, lighting, human clothing, and other factors. We avoid the prohibitive cost of manual supervision and show how to learn actlets automatically from synthetic videos of avatars driven by motion-capture data. We evaluate our method and show that it improves upon and complements existing techniques on the challenging UCF-Sports and YouTube-Actions datasets.
