Abstract

In this paper, a marker-based, single-person optical motion capture method (DeepMoCap) is proposed using multiple spatio-temporally aligned infrared-depth sensors and retro-reflective straps and patches (reflectors). DeepMoCap explores motion capture by automatically localizing and labeling reflectors on depth images and, subsequently, in 3D space. Introducing a non-parametric representation to encode the temporal correlation among pairs of colorized depthmaps and 3D optical flow frames, a multi-stage Fully Convolutional Network (FCN) architecture is proposed to jointly learn reflector locations and their temporal dependency among sequential frames. The extracted reflector 2D locations are spatially mapped to 3D space, resulting in robust 3D optical data extraction. The subject’s motion is efficiently captured by applying a template-based fitting technique on the extracted optical data. Two datasets have been created and made publicly available for evaluation purposes; one comprising multi-view depth and 3D optical flow annotated images (DMC2.5D), and a second consisting of spatio-temporally aligned multi-view depth images along with skeleton, inertial and ground truth MoCap data (DMC3D). The FCN model outperforms its competitors on the DMC2.5D dataset using the 2D Percentage of Correct Keypoints (PCK) metric, while the motion capture outcome is evaluated against RGB-D and inertial data fusion approaches on DMC3D, outperforming the next best method by 4.5% in total 3D PCK accuracy.
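To illustrate the 2D-to-3D mapping step described above, the sketch below back-projects a detected reflector pixel on a depth image to a camera-space 3D point using the pinhole model. The intrinsic values and the `backproject` helper are illustrative assumptions, not the DeepMoCap implementation.

```python
import numpy as np

# Minimal sketch (not the authors' code): back-project a detected 2D reflector
# location on a depth image to a 3D point with the pinhole camera model.
# The intrinsics below (fx, fy, cx, cy) are hypothetical example values.

def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Map a pixel (u, v) with depth in meters to camera-space 3D coordinates."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

# Example: a reflector detected at pixel (320, 240) with a measured depth of 2.1 m
point_3d = backproject(320, 240, 2.1, fx=365.0, fy=365.0, cx=256.0, cy=212.0)
print(point_3d)
```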

Highlights

  • Human pose tracking, known as motion capture (MoCap), has been studied for decades and is still a very active and challenging research topic

  • The FCN model outperforms its competitors on the DMC2.5D dataset using the 2D Percentage of Correct Keypoints (PCK) metric, while the motion capture outcome is evaluated against RGB-D and inertial data fusion approaches on DMC3D, outperforming the next best method by 4.5% in total 3D PCK accuracy (a sketch of the PCK computation follows this list)

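For reference, the sketch below shows one common way to compute a 2D PCK score: a predicted keypoint counts as correct when it lies within a distance threshold of the ground truth. The fixed pixel threshold and the `pck` helper are illustrative assumptions, not necessarily the exact evaluation protocol used in the paper.

```python
import numpy as np

# Rough sketch of a 2D Percentage of Correct Keypoints (PCK) computation.
# The normalization choice (a fixed pixel threshold here) is an assumption.

def pck(pred, gt, threshold_px=10.0):
    """pred, gt: (N, 2) arrays of 2D keypoints; returns the fraction within threshold."""
    dists = np.linalg.norm(pred - gt, axis=1)
    return float(np.mean(dists <= threshold_px))

# Example with two keypoints, one inside and one outside the threshold
pred = np.array([[100.0, 50.0], [200.0, 80.0]])
gt = np.array([[103.0, 52.0], [230.0, 85.0]])
print(pck(pred, gt))  # 0.5 -> 50% of keypoints within 10 px
```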

Summary

Introduction

Human pose tracking, known as motion capture (MoCap), has been studied for decades and is still a very active and challenging research topic. The MoCap technique called Optical Motion Capture [1] constitutes the gold-standard for accurate and robust motion capture [2]. Optical MoCap solutions [3,4,5] employ multiple optical sensors and passive or active markers (passive markers are coated with retro-reflective material to reflect light, while active markers are powered to emit it; for passive marker-based MoCap systems, IR emitters are used to cast IR light on the markers) placed on the body of the subject to be captured. The 3D positions of the markers are extracted by intersecting their projections from two or more spatio-temporally aligned optical sensors. These solutions precisely capture the body movements, i.e., the body joint 3D positions and orientations per frame, at high frequency (ranging from 100 to 480 Hz).
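To illustrate how a marker's 3D position can be recovered by intersecting its projections from two calibrated, synchronized views, the following sketch performs linear (DLT) triangulation. The projection matrices, pixel coordinates, and the `triangulate` helper are hypothetical examples, not the cited systems' implementations.

```python
import numpy as np

# Illustrative sketch of marker triangulation from two calibrated views.
# P1, P2 are 3x4 projection matrices; x1, x2 are the marker's pixel coordinates.

def triangulate(P1, P2, x1, x2):
    """Return the 3D point minimizing the algebraic reprojection error (DLT)."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # de-homogenize

# Example with two synthetic cameras (assumed values)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])               # reference camera
P2 = np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])  # camera shifted 0.5 m along x
X_true = np.array([0.2, 0.1, 2.0, 1.0])
x1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]
x2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]
print(triangulate(P1, P2, x1, x2))  # ~ [0.2, 0.1, 2.0]
```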
