Inspired by the success of Histogram of Oriented Gradients (HOG) features in many vision tasks, we present a compact feature descriptor for action recognition called the fuzzy Histogram of Oriented Lines (f-HOL), a variant of the HOG descriptor. The features rest on the observation that, as an action is performed in a video, the slide area of the human body skeleton can be viewed as a spatiotemporal 3D surface. The f-HOL descriptor has two main advantages: it is robust to small geometric transformations, since small translations and rotations cause no large fluctuations in the histogram values, and it is not very sensitive to varying illumination conditions. The extracted features are fed into a discriminative conditional model based on Latent-Dynamic Conditional Random Fields (LDCRFs), which learns to recognize actions from video frames. On the benchmark Weizmann dataset, the proposed framework outperforms most existing state-of-the-art approaches, achieving an overall recognition rate of 98.2%. Furthermore, owing to its low computational demands, the framework is well suited to integration into real-time applications.