Abstract

Human pose forecasting, which aims to predict future body poses, is an important task in computer vision. Long-term pose forecasting is particularly challenging because joint-based representations make it hard to model long-range spatial-temporal dependencies. A second challenge is predicting uncertainty, since future motion is not deterministic. In this work, we present a novel Bayesian Spatial-Temporal Graph Transformer (BSTG-Trans) for predicting accurate and diverse future poses together with their uncertainty. First, we use a spatial-temporal graph transformer as the encoder and a temporal-spatial graph transformer as the decoder to model long-range spatial-temporal dependencies across pose joints and generate long-term future body poses. Second, we propose a Bayesian sampling module for uncertainty quantification of diverse future poses. Finally, we introduce a novel uncertainty estimation metric, Uncertainty Absolute Error, which measures both the accuracy and the uncertainty of each predicted future pose. BSTG-Trans achieves state-of-the-art performance against other baselines on Human3.6M and HumanEva-I in terms of accuracy, diversity, and uncertainty for long-term pose forecasting. Moreover, comprehensive ablation studies demonstrate the effectiveness and generalization of each module in BSTG-Trans.
Code and models are available at https://github.com/stoneMo/BSTG-Trans.
