Abstract
Human shape and pose estimation is a popular but challenging problem, especially when the body, hands, feet and face must be captured jointly for multiple closely interacting persons. Existing methods achieve total motion capture only for a single person, or for multiple persons without close interaction. In this paper, we present a fully automatic and effective method to capture full-body human performance, including body poses, face poses, hand gestures, and foot orientations, for multiple closely interacting persons. We predict 2D keypoints corresponding to the poses of the body, face, hands and feet of each person, and associate the same person across multi-view videos by computing personalized appearance descriptors, which reduces ambiguities and uncertainties. To deal with occlusions and obtain temporally coherent human shapes, we estimate the shape and pose of each person using spatio-temporal tracking and constraints. Experimental results demonstrate that our method achieves better performance than state-of-the-art methods.
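As an illustration of the cross-view association step, the following minimal sketch matches detections between two camera views by comparing appearance descriptors. It is not the method of this paper: the descriptor values, the use of cosine similarity, the exhaustive one-to-one assignment, and the names `cosine_similarity` and `associate_views` are all assumptions made for illustration.

```python
import math
from itertools import permutations


def cosine_similarity(a, b):
    """Cosine similarity between two descriptor vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def associate_views(desc_a, desc_b):
    """Match each detection in view A to a distinct detection in view B.

    Hypothetical helper: tries every one-to-one assignment and keeps the one
    with the highest total similarity. This is only practical for a handful
    of persons; it assumes view B has at least as many detections as view A.
    Returns a list where entry i is the index in view B matched to person i
    in view A.
    """
    if len(desc_a) > len(desc_b):
        raise ValueError("view B must contain at least as many detections as view A")

    best_score = -math.inf
    best_match = None
    for candidate in permutations(range(len(desc_b)), len(desc_a)):
        score = sum(
            cosine_similarity(desc_a[i], desc_b[j]) for i, j in enumerate(candidate)
        )
        if score > best_score:
            best_score = score
            best_match = candidate
    return list(best_match)


# Made-up 3-D descriptors for two persons seen from two cameras.
view_a = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.2]]
view_b = [[0.1, 0.9, 0.2], [0.9, 0.1, 0.1]]
matches = associate_views(view_a, view_b)
print(matches)
```

With these made-up descriptors, person 0 in view A is paired with detection 1 in view B and person 1 with detection 0. A practical system would likely replace the exhaustive search with a polynomial-time assignment solver and combine appearance with geometric cues.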