Abstract

This paper addresses three-dimensional reconstruction of a rigid object from monocular video sequences. Initially, object pose is estimated in each image by locating similar (unknown) textures, assuming flat depth maps for all input images. Shape-from-silhouette (Szeliski, 1993) is then applied to build a 3-D model (volume), which is used for a new round of pose estimation, this time by a model-based method that gives better estimates. Before the process is repeated by building a new volume, the pose estimates are adjusted to reduce error by maximizing a quality measure for shape-from-silhouette volume reconstruction. The volume feedback is terminated when the pose estimates change little compared with those produced by the previous iteration. The final output is a pose index (the last set of pose estimates) and a volume. Several experiments demonstrate good performance of the system. No model is assumed for the object. Feature points are neither detected nor tracked, so no problematic feature matching or correspondence is required. The high-level pose index generated for the input images can be used for content-based retrieval. Our method can also be applied to 3-D object tracking in video.
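To make the shape-from-silhouette step concrete, here is a minimal voxel-carving sketch. It is not the authors' implementation: it assumes orthographic cameras, an object rotating about the vertical axis, square binary silhouettes, and known pose angles, none of which the abstract specifies. The function name `carve_volume` and its parameters are hypothetical.

```python
import math

def carve_volume(silhouettes, angles, n=16):
    """Return the set of voxel indices consistent with every silhouette.

    Assumptions (illustrative only): the volume is the cube [-1, 1]^3 split
    into n^3 voxels; each view is an orthographic projection after rotating
    the object by `angle` radians about the vertical (y) axis; each
    silhouette is a square grid of booleans indexed as sil[row][col], with
    row increasing along y and col increasing along the horizontal axis.
    """
    centers = [-1.0 + (2 * i + 1) / n for i in range(n)]
    volume = set()
    for ix, x in enumerate(centers):
        for iy, y in enumerate(centers):
            for iz, z in enumerate(centers):
                keep = True
                for sil, angle in zip(silhouettes, angles):
                    size = len(sil)
                    # Horizontal image coordinate of the voxel centre.
                    u = x * math.cos(angle) + z * math.sin(angle)
                    if abs(u) >= 1.0:
                        keep = False  # projects outside the image
                        break
                    col = int((u + 1.0) / 2.0 * size)
                    row = int((y + 1.0) / 2.0 * size)
                    if not sil[row][col]:
                        keep = False  # outside this silhouette: carve away
                        break
                if keep:
                    volume.add((ix, iy, iz))
    return volume

# Two views, 90 degrees apart, of the same centred square silhouette.
square = [[2 <= r <= 5 and 2 <= c <= 5 for c in range(8)] for r in range(8)]
volume = carve_volume([square, square], [0.0, math.pi / 2], n=8)
```

A voxel survives only if it projects inside the silhouette in every view, so errors in the assumed pose angles carve away valid voxels. This is why a quality measure on the reconstructed volume can serve as a signal for adjusting pose estimates, as the abstract describes.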
