Abstract

In this paper, we propose a deep progressive reinforcement learning (DPRL) method for action recognition in skeleton-based videos, which aims to distil the most informative frames and discard ambiguous frames in sequences for recognizing actions. Since the choices of selecting representative frames are multitudinous for each video, we model the frame selection as a progressive process through deep reinforcement learning, during which we progressively adjust the chosen frames by taking two important factors into account: (1) the quality of the selected frames and (2) the relationship between the selected frames to the whole video. Moreover, considering the topology of human body inherently lies in a graph-based structure, where the vertices and edges represent the hinged joints and rigid bones respectively, we employ the graph-based convolutional neural network to capture the dependency between the joints for action recognition. Our approach achieves very competitive performance on three widely used benchmarks.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call