Metrically Accurate 3D Human Avatars from Silhouette Images
- Conference Article
306
- 10.1109/iccv.2011.6126465
- Nov 1, 2011
The 3D shape of the human body is useful for applications in fitness, games and apparel. Accurate body scanners, however, are expensive, limiting the availability of 3D body models. We present a method for human shape reconstruction from noisy monocular image and range data using a single inexpensive commodity sensor. The approach combines low-resolution image silhouettes with coarse range data to estimate a parametric model of the body. Accurate 3D shape estimates are obtained by combining multiple monocular views of a person moving in front of the sensor. To cope with varying body pose, we use a SCAPE body model which factors 3D body shape and pose variations. This enables the estimation of a single consistent shape while allowing pose to vary. Additionally, we describe a novel method to minimize the distance between the projected 3D body contour and the image silhouette that uses analytic derivatives of the objective function. We propose a simple method to estimate standard body measurements from the recovered SCAPE model and show that the accuracy of our method is competitive with commercial body scanning systems costing orders of magnitude more.
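The abstract's silhouette term — minimizing the distance between the projected 3D body contour and the image silhouette — can be illustrated with a toy chamfer-style distance. This is a hedged sketch only, not the authors' actual objective (which uses analytic derivatives of a projected-contour distance within a SCAPE fit); the function and point sets below are hypothetical:

```python
# Toy chamfer-style silhouette distance: mean distance from each projected
# model-contour point to its nearest image-silhouette point. This is one
# (asymmetric) term of the kind of objective the abstract describes.
import math

def chamfer_distance(model_pts, silhouette_pts):
    """Mean nearest-neighbor distance from model contour to silhouette."""
    total = 0.0
    for mx, my in model_pts:
        total += min(math.hypot(mx - sx, my - sy) for sx, sy in silhouette_pts)
    return total / len(model_pts)

# Toy example: a square contour shifted by 0.5 in x against the target.
target = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
shifted = [(x + 0.5, y) for x, y in target]
print(chamfer_distance(shifted, target))  # → 0.5
```

In the paper's setting this objective is made differentiable so that pose and shape parameters of the body model can be optimized with analytic gradients rather than a discrete nearest-point search.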
- Book Chapter
4
- 10.1007/978-1-4471-4640-7_6
- Jan 1, 2013
The 3D shape of the human body is useful for applications in fitness, games, and apparel. Accurate body scanners, however, are expensive, limiting the availability of 3D body models. Although there has been a great deal of interest recently in the use of active depth sensing cameras, such as the Microsoft Kinect, for human pose tracking, little has been said about the related problem of human shape estimation. We present a method for human shape reconstruction from noisy monocular image and range data using a single inexpensive commodity sensor. The approach combines low-resolution image silhouettes with coarse range data to estimate a parametric model of the body. Accurate 3D shape estimates are obtained by combining multiple monocular views of a person moving in front of the sensor. To cope with varying body pose, we use a SCAPE body model which factors 3D body shape and pose variations. This enables the estimation of a single consistent shape, while allowing pose to vary. Additionally, we describe a novel method to minimize the distance between the projected 3D body contour and the image silhouette that uses analytic derivatives of the objective function. We use a simple method to estimate standard body measurements from the recovered SCAPE model and show that the accuracy of our method is competitive with commercial body scanning systems costing orders of magnitude more.
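The "simple method to estimate standard body measurements" is not detailed in the abstract. As a hedged illustration of one common approach, a circumference (e.g., waist) can be approximated as the perimeter of an ordered cross-section ring of the recovered mesh; all names below are hypothetical:

```python
# Illustrative sketch (not the authors' method): approximate a body
# circumference as the perimeter of an ordered ring of (x, z) points
# sampled where a horizontal plane slices the recovered mesh.
import math

def slice_circumference(ring):
    """Perimeter of an ordered ring of 2D cross-section points."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))

# Toy example: a 64-gon approximating a circle of radius 0.15 m.
r = 0.15
ring = [(r * math.cos(2 * math.pi * k / 64), r * math.sin(2 * math.pi * k / 64))
        for k in range(64)]
print(slice_circumference(ring))  # ≈ 0.942, close to 2*pi*r
```

A real implementation would intersect the plane with mesh triangles and order the resulting points, but the measurement itself reduces to a polyline length as above.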
- Conference Article
28
- 10.1109/3dv.2019.00039
- Sep 1, 2019
We propose a novel computer vision system for reconstructing 3D body shapes from 2D images with the goal of producing highly accurate anthropometric measurements from a pair of images. We adopt a supervised learning approach that maps silhouette images to 3D body shapes via a convolutional neural network (CNN). We propose three key improvements over previous approaches: (1) Large-scale realistic synthetic data generation, including more realistic variations in segmentation noise and camera viewpoints. (2) A multi-task learning (MTL) approach to predicting multiple outputs such as shape, 3D joint locations, pose angles, and body volume. (3) A new network architecture that additionally takes known body measurements (e.g., height) and per-pixel segmentation confidence as input. Ablation studies show the improvement in accuracy due to the various components of our system. Results demonstrate that our system produces state-of-the-art results on body circumference errors. We also analyze the repeatability of our system in the presence of realistic camera, background, and pose variations. Our system achieves a vertex standard deviation of ~3 mm on the CAESAR dataset [36].
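The multi-task idea in point (2) — one shared feature representation feeding several task-specific output heads — can be sketched with plain linear heads. This is an illustrative toy, not the paper's CNN architecture; all dimensions and names are hypothetical:

```python
# Toy multi-task output layer: a shared feature vector drives separate
# linear heads for shape coefficients, 3D joints, and body volume.
import random

random.seed(0)

def linear_head(features, weights, bias):
    """One task head: out[i] = sum_j W[i][j] * f[j] + b[i]."""
    return [sum(f * w for f, w in zip(features, row)) + b
            for row, b in zip(weights, bias)]

feat_dim = 8
heads = {"shape": 10, "joints": 3, "volume": 1}  # hypothetical output sizes

features = [random.random() for _ in range(feat_dim)]
params = {name: ([[random.gauss(0, 0.1) for _ in range(feat_dim)]
                  for _ in range(out_dim)], [0.0] * out_dim)
          for name, out_dim in heads.items()}
outputs = {name: linear_head(features, W, b) for name, (W, b) in params.items()}
print({name: len(v) for name, v in outputs.items()})
# → {'shape': 10, 'joints': 3, 'volume': 1}
```

In the actual system the shared features come from a CNN over the silhouettes, and the auxiliary heads (joints, pose angles, volume) act as regularizers that improve the primary shape prediction.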
- Conference Article
2
- 10.1109/im.2003.1240244
- Oct 27, 2003
We present a range image refinement technique for generating accurate 3D computer models of real objects. Range images obtained from a stereo-vision system typically exhibit geometric distortions on reconstructed 3D surfaces due to inherent stereo matching problems such as occlusions or mismatches. We introduce a range image refinement technique that corrects such erroneous ranges by employing the epipolar geometry of a multiview modelling system and the visual hull of the object. After registering multiple range images into a common coordinate system, we first determine whether a 3D point in a range image is erroneous by measuring how well the point registers with its correspondences in other range images. The correspondences are determined on 3D contours that are inverse projections of epipolar lines in other 2D silhouette images. If the point is erroneous, its range is then refined onto the object's surface. We employ two techniques to speed up the correspondence search. When no correspondence exists for an erroneous point, we refine the point onto the visual hull of the object. We show that refined range images yield better geometric structure in reconstructed 3D models.
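The visual-hull fallback described above rests on a simple constraint: a valid 3D surface point must project inside the object's silhouette in every view. A hedged sketch of that consistency test, with hypothetical orthographic cameras and tiny binary masks standing in for real calibrated views:

```python
# Illustrative visual-hull membership test (not the paper's algorithm):
# a 3D point is kept only if its projection lands inside every view's
# binary silhouette mask.
def inside_all_silhouettes(point, cameras, masks):
    """cameras: projection functions 3D -> (u, v); masks: 2D binary grids."""
    for project, mask in zip(cameras, masks):
        u, v = project(point)
        if not (0 <= v < len(mask) and 0 <= u < len(mask[0]) and mask[v][u]):
            return False
    return True

# Toy setup: two orthographic views (front drops z, side drops x) of an
# object occupying cells 1..2 of a 4x4 silhouette grid in each view.
mask_front = [[1 if 1 <= u <= 2 and 1 <= v <= 2 else 0 for u in range(4)]
              for v in range(4)]
mask_side = [row[:] for row in mask_front]
cams = [lambda p: (p[0], p[1]),   # front: (x, y)
        lambda p: (p[2], p[1])]   # side:  (z, y)

print(inside_all_silhouettes((1, 1, 1), cams, [mask_front, mask_side]))  # True
print(inside_all_silhouettes((3, 1, 1), cams, [mask_front, mask_side]))  # False
```

In the paper's pipeline, an erroneous range point with no valid cross-view correspondence is moved onto the boundary of this hull rather than merely tested against it.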