Abstract

Action detection and recognition are popular subjects of research in the field of computer vision. The task of action detection can be regarded as the sum of action location and recognition. Action features described by using information concerning the human skeleton have the advantages of robustness against external factors and requiring a small amount of calculation. This study proposes a skeleton‐based action analysis model based on a recurrent neural network framework. The model learns action features by modelling static and dynamic features of skeleton joints and the importance of different video frames by introducing an attention module. For action location, conditional random field loss function is introduced to establish the context dependency of output labels. In the aspect of action recognition, the hierarchical training mechanism with triple loss models action features at coarse‐grained and fine‐grained levels. The authors’ proposed method delivers state‐of‐the‐art results on action location and recognition tasks.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.