Abstract
We address how 3D human pose can be tracked from a monocular video using a probabilistic inference method. The human body is modeled as a collection of cylinders in space, each with an appearance facet and a pose facet. The appearance facets are acquired in a learning phase from the initial frames of the input video; for this, a visual hull description of the target subject, constructed from multiple images, proves instrumental. In the operation phase, the 3D pose of the target subject is tracked through the subsequent frames of the video. A bottom-up framework is used: for each current frame, tentative candidates for each body part are first extracted in the image space. The human model, with its appearance facets already learned and its pose entries initialized from those of the previous frame, is then brought in under a belief propagation algorithm to establish correspondence with these 2D body part candidates while enforcing proper articulation between the body parts, thereby determining the 3D pose of the human body in the current frame. Tracking performance on a number of monocular videos is reported.
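To make the bottom-up tracking step more concrete, the sketch below shows one hypothetical way such a per-frame inference could be organized: each body part has a discrete list of candidate poses extracted from the current image, and max-product belief propagation over a tree-structured body model picks the jointly most consistent combination, balancing appearance match against articulation constraints. This is not the authors' implementation; the part names, the simple squared-distance cost functions, and the tree structure are illustrative assumptions only.

```python
# Hypothetical sketch of bottom-up pose selection via max-product belief
# propagation on a tree-structured body model. All costs are placeholders.
import numpy as np

# Kinematic tree: part -> parent (torso is the root).
PARENT = {
    "torso": None,
    "head": "torso",
    "l_upper_arm": "torso",
    "r_upper_arm": "torso",
    "l_thigh": "torso",
    "r_thigh": "torso",
}

def unary_cost(part, candidate, appearance_model):
    """Appearance mismatch between a candidate and the learned appearance
    facet of this part (placeholder: squared distance)."""
    return float(np.sum((candidate - appearance_model[part]) ** 2))

def pairwise_cost(child_candidate, parent_candidate):
    """Articulation penalty between connected parts (placeholder:
    squared distance between their pose vectors)."""
    return float(np.sum((child_candidate - parent_candidate) ** 2))

def track_frame(candidates, appearance_model):
    """One tracking step: max-product BP from leaves to root, then
    backtracking to recover the best candidate for every part.

    candidates: dict part -> (K, D) array of candidate pose vectors.
    Returns: dict part -> selected (D,) candidate.
    """
    children = {p: [c for c, par in PARENT.items() if par == p] for p in PARENT}
    root = next(p for p, par in PARENT.items() if par is None)
    msg, argmsg = {}, {}  # upward messages and backpointers

    def send_up(part):
        K = len(candidates[part])
        cost = np.array([unary_cost(part, candidates[part][k], appearance_model)
                         for k in range(K)])
        for c in children[part]:
            send_up(c)
            cost += msg[c]          # fold in each child's subtree cost
        parent = PARENT[part]
        if parent is None:
            return cost             # total cost per root candidate
        Kp = len(candidates[parent])
        m = np.empty(Kp)
        am = np.empty(Kp, dtype=int)
        for j in range(Kp):         # best choice of this part per parent candidate
            totals = cost + np.array(
                [pairwise_cost(candidates[part][k], candidates[parent][j])
                 for k in range(K)])
            am[j] = int(np.argmin(totals))
            m[j] = totals[am[j]]
        msg[part], argmsg[part] = m, am

    root_cost = send_up(root)
    best = {root: int(np.argmin(root_cost))}

    def backtrack(part):
        for c in children[part]:
            best[c] = int(argmsg[c][best[part]])
            backtrack(c)

    backtrack(root)
    return {p: candidates[p][best[p]] for p in PARENT}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    appearance = {p: rng.normal(size=3) for p in PARENT}   # learned facets (toy)
    cands = {p: rng.normal(size=(5, 3)) for p in PARENT}   # 2D-derived candidates (toy)
    print(track_frame(cands, appearance))
```

In practice the candidate poses for the current frame would also be seeded from the pose estimated in the previous frame, as the abstract describes, so the selection acts as a temporal tracker rather than a per-frame detector.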