Abstract
Acquiring a three-dimensional perception of an object or a scene from regular (single-camera, 2-D) video is a trivial task for humans. Automating this task has been, and still is, one of the major problems of computer vision. The new approach introduced in this thesis focuses on volume reconstruction of an object from image sequences taken by a single camera. One of the numerous applications of this approach is 3-D object tracking in video, which can be used in very low bit-rate customized video transmission schemes. A multi-objective pose estimation method is introduced that computes the relative pose of the object between two input frames. One advantage of this method is that it does not use any feature points, so it does not suffer from problems with feature point detection and tracking. Also, the method does not assume any model for the object at the outset, hence it can be applied to an arbitrary object. The method, however, requires a depth-map, which is not readily available from an image sequence. To overcome this requirement, an iterative scheme is employed. The first round of pose estimation between consecutive frames is performed assuming flat depth-maps. Pose estimates are then adjusted to reduce the error by maximizing a novel quality factor for shape-from-silhouette volume reconstruction. Shape-from-silhouette is applied to construct a 3-D model (volume), which provides depth-maps for the next round of pose estimation. The feedback loop is terminated when the pose estimates no longer change significantly from those produced by the previous iteration. Based on our theoretical study of the proposed system, a test of convergence to a given set of poses is devised. To handle input sequences with unknown frame order, the input sequence undergoes a pre-processing stage in which the frames are re-ordered to obtain the most accurate pose estimation.
A theoretical validity criterion for volume reconstruction by shape-from-silhouette is established. This criterion is used to produce a volume reconstruction quality factor, which plays an important role in pose estimation adjustment. The reliable performance of our system is demonstrated through several simulations carried out on both synthetic and real image sequences. The effects of pose sampling rate, distribution of pose samples, and error in input pose on the quality of shape-from-silhouette volume reconstruction are studied. It is shown that high levels of pose error cannot be compensated for by an increase in pose sampling rate, and that volume reconstruction at high pose sampling rates is more sensitive to pose error.
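
As a rough illustration of the shape-from-silhouette step named in the abstract, the sketch below carves a voxel grid by keeping only the voxels whose projection falls inside every silhouette. It is not the implementation used in the thesis. The orthographic camera, the restriction to rotations about the vertical axis, and the names `carve_visual_hull`, `silhouettes`, `angles`, and `size` are all assumptions made for this example.

```python
import math


def carve_visual_hull(silhouettes, angles, size):
    """Return the set of voxels consistent with every silhouette.

    Assumptions (illustrative only, not taken from the thesis):
      - orthographic camera looking along the z-axis,
      - the object pose in each view is a rotation about the vertical (y) axis,
      - each silhouette is a size x size grid of booleans indexed [row][col].
    """
    c = (size - 1) / 2.0  # centre of the voxel grid and of each image
    hull = set()
    for ix in range(size):
        for iy in range(size):
            for iz in range(size):
                x, z = ix - c, iz - c
                inside_all = True
                for sil, a in zip(silhouettes, angles):
                    # Rotate about the y-axis, then drop depth (orthographic projection).
                    u = x * math.cos(a) + z * math.sin(a)
                    col = int(round(u + c))
                    row = iy  # the vertical coordinate is unchanged by the rotation
                    if not (0 <= col < size and sil[row][col]):
                        inside_all = False
                        break
                if inside_all:
                    hull.add((ix, iy, iz))
    return hull


# Two views of a centred square silhouette, taken 90 degrees apart.
size = 9
square = [[2 <= r <= 6 and 2 <= col <= 6 for col in range(size)] for r in range(size)]
hull = carve_visual_hull([square, square], [0.0, math.pi / 2], size)
```

In this sketch the pose of each view enters only through `angles`, so an error in a pose estimate shifts the projected column `col` and carves away voxels that belong to the object. This is the sensitivity to pose error that the abstract refers to.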