Abstract

Deep learning methods have achieved state-of-the-art results in human action recognition. These methods process a full video sequence to recognize an action, which is often unnecessary because many frames are highly similar. Keyframe-based methods have recently been proposed to address this redundancy. Although keyframe-based methods show competitive performance in action recognition, both full-sequence and keyframe-based methods still process all the required frames of a video clip and average the results of the individual clips/frames to recognize the action of the video. We argue that by simply averaging the results of the video clips, deep models discard the motion information of the video, which leads to inaccurate recognition of the action. To address this issue, we propose a new online temporal classification model (OTCM) that classifies an action from a video in an online fashion and avoids averaging by making a decision for each frame of the video sequence. In addition, we propose a new action inference graph (AIG) that enables early recognition. The proposed model can therefore recognize an action before all the keyframes or the whole video sequence have been processed, and thus requires less computation for recognizing human actions. Moreover, our OTCM can perform online action detection. To the best of our knowledge, this is the first time that an OTCM model combined with an AIG has been proposed. Experimental results on benchmark datasets show that the proposed OTCM model sets new state-of-the-art results, in particular without using full video sequences.
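To make the per-frame, early-exit idea concrete, below is a minimal sketch of online classification with an early decision rule: evidence is accumulated frame by frame and a label is emitted as soon as one class dominates, rather than averaging over the whole sequence. The names (`frame_classifier`, `online_classify`) and the confidence threshold are hypothetical illustrations under generic assumptions, not the paper's actual OTCM or AIG.

```python
import numpy as np

# Hypothetical per-frame classifier: in practice this would be a trained
# deep network; here it returns a random class-probability vector so the
# sketch stays self-contained.
_rng = np.random.default_rng(0)

def frame_classifier(frame, num_classes=5):
    logits = _rng.normal(size=num_classes)
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def online_classify(frames, num_classes=5, threshold=0.9):
    """Accumulate per-frame log-evidence and stop early once one class
    exceeds the confidence threshold (an assumed early-exit rule)."""
    evidence = np.zeros(num_classes)
    posterior = np.full(num_classes, 1.0 / num_classes)
    for t, frame in enumerate(frames, start=1):
        evidence += np.log(frame_classifier(frame, num_classes) + 1e-12)
        posterior = np.exp(evidence - evidence.max())
        posterior /= posterior.sum()
        if posterior.max() >= threshold:
            # Confident enough: emit the decision without seeing the rest.
            return int(posterior.argmax()), t
    # Fell through: every frame was needed.
    return int(posterior.argmax()), len(frames)

# Toy usage: 30 dummy "frames" standing in for video frames.
frames = [np.zeros((224, 224, 3)) for _ in range(30)]
label, frames_used = online_classify(frames)
print(f"predicted class {label} after {frames_used} frames")
```

The design choice this illustrates is the abstract's core argument: a per-frame decision rule can commit to a label part-way through the sequence, so the cost of recognition scales with the difficulty of the video rather than its length.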
