Abstract

Action and gesture recognition is essential in computer vision because of its many potential applications. Recent literature reports dramatic advances in recognizing gestures and actions under uncontrolled scenarios with significant appearance and motion variations. Nevertheless, many of these approaches still require manual segmentation of temporal action boundaries and complete processing of whole sequences to obtain a prediction. This work introduces a novel motion description that can recognize actions and gestures over partial sequences. The approach starts by representing video sequences as a set of key-point trajectories. These trajectories are then represented hierarchically, from a local and a regional perspective, following a statistical counting process. First, at the local level, each trajectory is defined as a binary occurrence pattern that highlights critical motions through neighborhood densities. These occurrence patterns then feed a regional bag-of-words representation of actions. Both representations can be obtained for any interval inside the video, enabling partial recognition of motion; the regional representation is passed to a support vector machine to obtain a prediction. The proposed approach was evaluated on academic action recognition datasets and a large gesture dataset used for sign recognition. For partial video sequence recognition, the proposed approach achieves an accuracy rate of 63% using only 20% of frames. The proposed strategy yields a very compact description, with only 400 scalar values, which is ideal for online applications.
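
The idea of computing a regional bag-of-words descriptor over any interval of the video can be illustrated with a minimal sketch. This is not the authors' implementation: the Hamming-distance word assignment, the `(frame, pattern)` trajectory layout, and the names `hamming`, `nearest_word`, and `partial_histogram` are assumptions made for illustration, and the two-word codebook is a toy stand-in for a real one. The support vector machine step is omitted.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[int, ...]          # hypothetical binary occurrence pattern
Trajectory = Tuple[int, Pattern]   # (frame index, pattern); assumed layout


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length binary patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_word(pattern: Pattern, codebook: List[Pattern]) -> int:
    """Return the index of the closest codeword (ties go to the lowest index)."""
    return min(range(len(codebook)), key=lambda i: hamming(pattern, codebook[i]))


def partial_histogram(
    trajectories: List[Trajectory],
    codebook: List[Pattern],
    start: int,
    end: int,
) -> List[float]:
    """Normalized bag-of-words histogram over frames in [start, end)."""
    counts = [0] * len(codebook)
    for frame, pattern in trajectories:
        if start <= frame < end:
            counts[nearest_word(pattern, codebook)] += 1
    total = sum(counts)
    # An empty interval yields an all-zero descriptor instead of dividing by zero.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy data: two codewords and four trajectories spread over ten frames.
codebook = [(0, 0, 0, 0), (1, 1, 1, 1)]
trajectories = [
    (0, (0, 0, 0, 1)),
    (1, (1, 1, 1, 0)),
    (2, (1, 1, 0, 1)),
    (9, (0, 0, 0, 0)),
]

# Descriptor built from the first three frames only (a partial sequence).
descriptor = partial_histogram(trajectories, codebook, start=0, end=3)
```

Because the histogram length equals the codebook size, the descriptor has the same dimensionality for a partial interval as for the whole video, so the same classifier can in principle score either.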
