Abstract

Human action recognition is widely explored because it finds varied applications, including visual navigation, surveillance, video indexing, biometrics, human–computer interaction, and ambient assisted living. This paper designs and analyzes the performance of spatial and temporal CNN streams for action recognition from videos. An action video is fragmented into a predefined number of segments called snippets. For each segment, the atomic poses portrayed by the individual are effectively captured by a representative frame, and the dynamics of the action are well described by a dynamic image. The representative frames and dynamic images are used to train separate Convolutional Neural Networks for further analysis. The results attained on the KTH, Weizmann, UCF Sports, and UCF101 datasets ascertain the efficiency of the proposed architecture for action recognition.
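The abstract does not detail how the dynamic image is computed or how the representative frame is chosen. A common construction for dynamic images is approximate rank pooling (Bilen et al.), a weighted sum of a snippet's frames in which later frames receive larger weights, so that motion direction is encoded in a single image; a simple choice of representative frame is the snippet's middle frame. The sketch below illustrates both under these assumptions; the function names and the middle-frame heuristic are illustrative, not taken from the paper.

```python
import numpy as np

def dynamic_image(frames):
    """Collapse a snippet of frames (each H x W x C, uint8) into one
    'dynamic image' via approximate rank pooling: a weighted sum where
    frame t (1-indexed) gets weight alpha_t = 2t - T - 1, so later
    frames dominate and motion direction is preserved."""
    T = len(frames)
    alphas = np.array([2 * t - T - 1 for t in range(1, T + 1)],
                      dtype=np.float32)
    stack = np.stack(frames).astype(np.float32)      # (T, H, W, C)
    di = np.tensordot(alphas, stack, axes=1)         # (H, W, C)
    # Rescale to [0, 255] so the result can feed a CNN like an RGB image.
    di -= di.min()
    if di.max() > 0:
        di /= di.max()
    return (di * 255).astype(np.uint8)

def representative_frame(frames):
    """A simple representative-frame heuristic: the middle frame
    of the snippet (an assumption, not the paper's stated method)."""
    return frames[len(frames) // 2]

def split_into_snippets(frames, n_segments):
    """Fragment a video (list of frames) into n_segments contiguous
    snippets of roughly equal length."""
    bounds = np.linspace(0, len(frames), n_segments + 1, dtype=int)
    return [frames[bounds[i]:bounds[i + 1]] for i in range(n_segments)]
```

Each snippet would then contribute one representative frame to the spatial CNN stream and one dynamic image to the temporal stream.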

