Abstract

A system for 3-D reconstruction of a rigid object from monocular video sequences is introduced. Initially, an object pose is estimated in each image by locating similar (unknown) texture, assuming a flat depth map for all images. Shape-from-silhouette, as described by R. Szeliski (1993), is then applied to construct a 3-D model, which is used to obtain better pose estimates with a model-based method. Before the process is repeated by building a new 3-D model, the pose estimates are adjusted to reduce error by maximizing a quality measure for shape-from-silhouette volume reconstruction. Translation of the object in the input sequence is compensated in two stages. The volume feedback is terminated when the updates in the pose estimates become small. The final output is a pose index (the last set of pose estimates) and a 3-D model of the object. Experiments on a real video sequence of a human head show good performance of the system. Our method has the following advantages: (1) No model is assumed for the object. (2) Feature points are neither detected nor tracked, so no problematic feature matching or lengthy point tracking is required. (3) The method generates a high-level pose index for the input images, which can be used for content-based retrieval. Our method can also be applied to 3-D object tracking in video.
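The shape-from-silhouette step can be illustrated with a minimal voxel-carving sketch. It is not the octree method of Szeliski (1993) nor the system described above. It assumes an orthographic camera, a pose reduced to a single rotation angle about the vertical axis, and silhouettes given as sets of integer pixel coordinates. All function names (`project`, `carve`, `disc`) are hypothetical.

```python
import math


def project(point, angle):
    """Orthographically project a 3-D point after rotating it by `angle`
    (radians) about the vertical (y) axis. Assumed camera model."""
    x, y, z = point
    u = x * math.cos(angle) + z * math.sin(angle)
    return (round(u), round(y))


def carve(silhouettes, angles, half_size):
    """Keep every voxel whose projection lies inside all silhouettes.

    silhouettes: list of sets of (u, v) pixels, one per image
    angles:      pose estimate (rotation angle) for each image
    half_size:   the voxel grid spans [-half_size, half_size] on each axis
    """
    axis = range(-half_size, half_size + 1)
    volume = set()
    for x in axis:
        for y in axis:
            for z in axis:
                if all(project((x, y, z), a) in s
                       for s, a in zip(silhouettes, angles)):
                    volume.add((x, y, z))
    return volume


def disc(radius):
    """Synthetic silhouette: a filled disc, as a sphere would cast."""
    r = range(-radius, radius + 1)
    return {(u, v) for u in r for v in r if u * u + v * v <= radius * radius}


# Two views leave extra material near the corners; four views carve more away.
two_views = carve([disc(3)] * 2, [0.0, math.pi / 2], half_size=4)
four_views = carve([disc(3)] * 4,
                   [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4],
                   half_size=4)
print(len(two_views), len(four_views))
```

In this toy setting, adding views can only shrink the reconstructed volume. That suggests why pose errors matter for the reconstruction: a wrongly estimated angle makes the silhouettes inconsistent and carves away true object voxels.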
