Abstract

Despite continuous advances in human pose estimation, retrieving high-quality recordings of real-life human motion with commodity hardware remains a challenge. This work therefore focuses on predicting and improving estimates of human motion, with the aim of achieving production quality for skinned mesh animations using off-the-shelf webcams. We take advantage of recent findings in the field by employing a recurrent neural network architecture to (1) predict and (2) denoise human motion, with the intention of bridging the gap between cheap recording methods and high-quality recording. First, we propose an LSTM to predict short-term human motion, which achieves results competitive with state-of-the-art methods. Then, we adapt this model architecture and train it to clean up noisy human motion from two low-quality 3D input sources, thereby mimicking a real-world recording scenario that yields noisy estimates. Experiments on simulated data show that the model is capable of significantly reducing noise, which opens the way for future work to test the model on annotated data.
