Abstract

We propose a novel camera calibration method for room-scale multi-view imaging systems. Our key idea is to leverage a person's articulated body movements as a calibration target. We show that a freely moving person provides trajectories of a set of oriented points (e.g., the neck joint with its spine direction) from which we can estimate the locations and poses of all cameras observing them. The method only requires that the cameras are synchronized and that 2D human poses can be estimated in each view's image sequence. By lifting these 2D poses to 3D, which directly yields a set of oriented 3D joints, we compute the extrinsic parameters of all cameras with a linear algorithm. We also show that this enables self-supervised refinement of the 3D joint estimator, and that iterating the two steps leads to accurate camera extrinsics and 3D pose estimates up to scale. Extensive experiments on synthetic and real data demonstrate the effectiveness and flexibility of the method. The method can serve as a useful tool for expanding the utility of multi-view vision systems, as it eliminates the need for cumbersome on-site calibration procedures.
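
As a simplified illustration of the extrinsic-alignment idea, the sketch below relates two cameras by aligning their lifted 3D joint trajectories with a standard Umeyama/Kabsch-style similarity fit. This is not the paper's oriented-point linear solver: the function name align_camera_frames, its interface, and the use of point correspondences alone (ignoring the joint-orientation cue) are assumptions made for brevity.

    import numpy as np

    def align_camera_frames(joints_cam_a, joints_cam_b):
        """Estimate the similarity transform (R, t, s) mapping camera B's
        frame to camera A's frame from corresponding 3D joint trajectories.

        joints_cam_a, joints_cam_b : (N, 3) arrays of the same joints over
        time, each expressed in that camera's local coordinate frame
        (e.g., lifted from its 2D poses). Returns R (3x3), t (3,), s such
        that joints_cam_a ~= s * R @ joints_cam_b + t.
        """
        a = np.asarray(joints_cam_a, dtype=float)
        b = np.asarray(joints_cam_b, dtype=float)
        mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
        a_c, b_c = a - mu_a, b - mu_b

        # Cross-covariance and SVD give the least-squares rotation (Kabsch).
        H = b_c.T @ a_c
        U, S, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))
        D = np.diag([1.0, 1.0, d])      # guard against reflections
        R = Vt.T @ D @ U.T

        # Relative scale between the two monocular reconstructions; the
        # global metric scale remains unresolved, as stated in the abstract.
        s = np.trace(np.diag(S) @ D) / np.sum(b_c ** 2)
        t = mu_a - s * R @ mu_b
        return R, t, s

In a multi-camera setup, such an alignment (or the oriented-point linear algorithm described above) would be applied between each camera's lifted 3D poses and a reference camera, giving all extrinsics in a common frame up to a single global scale.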
