Abstract

This paper addresses human action recognition in video sequences. A method based on optical flow estimation is presented, in which critical points of the flow field are extracted. Multi-scale trajectories are generated from these points and characterized in the frequency domain. A sequence is then described by fusing this frequency information with motion orientation and shape information. The method has been tested on video datasets with recognition rates among the highest in the state of the art. Unlike recent dense sampling strategies, the proposed method requires only the critical points of the motion flow field, permitting a lower computational cost and a better sequence description. A cross-dataset generalization experiment is performed to illustrate the robustness of the method to recognition dataset biases. Results, comparisons and prospects on complex action recognition datasets are finally discussed.
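The pipeline outlined in the abstract can be illustrated with a minimal, self-contained sketch. This is not the paper's implementation: the flow field below is a synthetic rigid rotation (real flow would come from an optical flow estimator), the critical-point test is a simple near-zero-magnitude threshold, the trajectory is a plain Euler integration from a hypothetical seed point, and the descriptor is just the magnitudes of the first Fourier coefficients of the trajectory. All names and parameters are illustrative assumptions.

```python
import numpy as np

def critical_points(flow, eps=1e-3):
    """Return (y, x) pixel indices where the flow magnitude is near zero.

    Illustrative criterion only; the paper's critical-point definition
    may differ.
    """
    mag = np.linalg.norm(flow, axis=-1)
    ys, xs = np.where(mag < eps)
    return list(zip(ys, xs))

def frequency_descriptor(traj, n_coeffs=8):
    """Describe a 2-D trajectory by the magnitudes of its first
    Fourier coefficients (DC removed by mean subtraction)."""
    traj = np.asarray(traj, dtype=float)
    spec = np.fft.rfft(traj - traj.mean(axis=0), axis=0)
    return np.abs(spec[:n_coeffs]).ravel()

# Toy flow field: a rigid rotation about the image centre. Its only
# critical point (zero motion) is the centre pixel itself.
h = w = 33
ys, xs = np.mgrid[0:h, 0:w].astype(float)
cy, cx = h // 2, w // 2
flow = np.stack([-(ys - cy), (xs - cx)], axis=-1) * 0.05  # (u, v) per pixel

pts = critical_points(flow)            # -> [(16, 16)], the centre

# Generate a trajectory by Euler-integrating a seed point placed near
# the critical point, then summarize it in the frequency domain.
pos = np.array([cx + 5.0, float(cy)])  # (x, y), hypothetical seed offset
traj = []
for _ in range(64):
    traj.append(pos.copy())
    y, x = int(round(pos[1])), int(round(pos[0]))
    pos = pos + flow[y % h, x % w]      # step along the local flow vector
desc = frequency_descriptor(traj)       # 8 coefficients x 2 coordinates
```

In the full method, such frequency descriptors would be computed over trajectories at multiple scales and fused with motion orientation and shape information before classification; that fusion step is omitted here.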
