Abstract
Human action recognition is an important branch of computer vision science. It is a challenging task based on skeletal data because of joints’ complex spatiotemporal information. In this work, we propose a method for action recognition, which consists of three parts: view-independent representation, frame interpolation, and combined model. First, the action sequence becomes view-independent representations independent of the view. Second, when judgment conditions are met, differentiated frame interpolations are used to expand the temporal dimensional information. Then, a combined model is adopted to extract these representation features and classify actions. Experimental results on two multi-view benchmark datasets Northwestern-UCLA and NTU RGB+D demonstrate the effectiveness of our complete method. Although using only one type of action feature and a simple architecture combined model, our complete method still outperforms most of the referential state-of-the-art methods and has strong robustness.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have