Abstract

Human pose estimation from images or video is a fundamental problem in computer graphics and computer vision. For video, the key challenge is temporal coherence: the content of consecutive frames is similar, and pose estimates should therefore remain consistent over long time spans rather than only between adjacent frames. Most existing methods achieve long-term consistency by optimizing over the whole video, which is computationally expensive and still fails to keep the articulated limbs consistent across frames. In this paper, we propose a novel method for maintaining temporal consistency. We maintain the temporal consistency of the video through structured space learning and halfway temporal evaluation. A three-stage multi-feature deep convolutional network generates the initial joint positions, and long-term temporal coherence is propagated over the whole video at each stage. Long-term consistency is more appealing because it produces stable results over longer periods of time. Our method achieves good temporal consistency and yields accurate, stable human pose estimation results. Extensive experiments demonstrate the superiority of our method.
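The sketch below is a minimal illustration, not the authors' implementation, of the staged structure described above: per-frame joint estimates are refined in three stages, and after every stage a temporal pass is propagated over the entire sequence. The stage refinement is a placeholder for the paper's multi-feature deep convolutional network, and `propagate_temporal_coherence` stands in for the structured space learning / halfway temporal evaluation step; all names and the moving-average smoothing are assumptions for illustration only.

```python
# Minimal sketch of a three-stage, whole-video temporal propagation pipeline.
# All function names, the joint count, and the smoothing scheme are hypothetical.

import numpy as np

NUM_JOINTS = 14   # assumed number of body joints
NUM_STAGES = 3    # three refinement stages, as described in the abstract


def stage_refine(frames, joints, stage):
    """Placeholder for one stage of the multi-feature deep network.

    Takes video frames and current per-frame joint estimates and returns
    refined estimates of shape (T, NUM_JOINTS, 2). Here it is an identity.
    """
    return joints


def propagate_temporal_coherence(joints, window=5):
    """Propagate coherence over the whole sequence by smoothing each joint
    trajectory with a moving average (a stand-in for the paper's temporal step)."""
    T = joints.shape[0]
    smoothed = np.empty_like(joints)
    half = window // 2
    for t in range(T):
        lo, hi = max(0, t - half), min(T, t + half + 1)
        smoothed[t] = joints[lo:hi].mean(axis=0)
    return smoothed


def estimate_video_poses(frames):
    """Run all stages; after each stage the temporal pass covers the entire
    video, so consistency is maintained long-term, not just frame-to-frame."""
    T = len(frames)
    joints = np.zeros((T, NUM_JOINTS, 2))  # initial joint positions
    for stage in range(NUM_STAGES):
        joints = stage_refine(frames, joints, stage)
        joints = propagate_temporal_coherence(joints)
    return joints


if __name__ == "__main__":
    dummy_video = np.random.rand(30, 256, 256, 3)  # 30 synthetic frames
    poses = estimate_video_poses(dummy_video)
    print(poses.shape)  # (30, 14, 2)
```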
