Photogeometric Scene Flow for High-Detail Dynamic 3D Reconstruction
Photometric stereo (PS) is an established technique for high-detail reconstruction of 3D geometry and appearance. To correct for surface integration errors, PS is often combined with multiview stereo (MVS). With dynamic objects, PS reconstruction also faces the problem of computing optical flow (OF) for image alignment under rapid changes in illumination. Current PS methods typically compute optical flow and MVS as independent stages, each one with its own limitations and errors introduced by early regularization. In contrast, scene flow methods estimate geometry and motion, but lack the fine detail from PS. This paper proposes photogeometric scene flow (PGSF) for high-quality dynamic 3D reconstruction. PGSF performs PS, OF, and MVS simultaneously. It is based on two key observations: (i) while image alignment improves PS, PS allows for surfaces to be relit to improve alignment, (ii) PS provides surface gradients that render the smoothness term in MVS unnecessary, leading to truly data-driven, continuous depth estimates. This synergy is demonstrated in the quality of the resulting RGB appearance, 3D geometry, and 3D motion.
- Conference Article
7
- 10.1109/3dimpvt.2011.12
- May 1, 2011
Multi-view photometric stereo is well established for the shape recovery of static objects. However, images of a moving object are difficult to align under varying illumination, which makes photometric stereo reconstruction of dynamic objects hard. To tackle this issue, this paper presents an optical flow estimation approach that works under periodically varying illumination and, in cooperation with photometric stereo, enables high-quality 3D reconstruction of dynamic objects. First, multi-view images of the moving object are captured under periodically varying illumination by a multi-camera, multi-light system. Then, the optical flow is estimated to synthesize images under different illuminations for each viewpoint. Finally, the multi-view photometric stereo technique is employed to obtain a highly accurate 3D model for each time instant. Experimental results on moving actors demonstrate that temporally successive images under varying illumination are effectively registered, permitting accurate photometric reconstruction of moving objects.
- Conference Article
57
- 10.1109/iccv.2019.00114
- Oct 1, 2019
Highly accurate 3D volumetric reconstruction is still an open research topic where the main difficulty is usually related to merging some rough estimations with high-frequency details. One of the most promising methods is the fusion of multi-view stereo and photometric stereo images. Besides the intrinsic difficulties that multi-view stereo and photometric stereo each face in order to work reliably, supplementary problems arise when they are considered together. In this work, we present a volumetric approach to the multi-view photometric stereo problem. The key point of our method is the signed distance field parameterisation and its relation to the surface normal. This is exploited in order to obtain a linear partial differential equation, which is solved in a variational framework that combines multiple images from multiple points of view in a single system. In addition, the volumetric approach is naturally implemented on an octree, which allows for fast ray-tracing that reliably alleviates occlusions and cast shadows. Our approach is evaluated on synthetic and real datasets and achieves state-of-the-art results.
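The relation between a signed distance field and the surface normal mentioned above can be illustrated with a minimal sketch: on the zero level set, the unit normal is the normalized gradient of the field. The code below is not the paper's variational solver or its linear PDE; `sphere_sdf` and `sdf_normal` are hypothetical names, and the gradient is approximated with central finite differences on an analytic sphere.

```python
import math

def sphere_sdf(p, radius=1.0):
    """Signed distance from point p to a sphere centred at the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius

def sdf_normal(sdf, p, h=1e-5):
    """Approximate the unit normal at p as the normalized gradient of sdf.

    Uses central finite differences with step h (an illustrative choice).
    """
    grad = []
    for axis in range(3):
        hi = list(p)
        lo = list(p)
        hi[axis] += h
        lo[axis] -= h
        grad.append((sdf(hi) - sdf(lo)) / (2.0 * h))
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        raise ValueError("gradient vanishes at this point")
    return [g / norm for g in grad]

# Normal at the north pole of the unit sphere points along +z.
normal = sdf_normal(sphere_sdf, [0.0, 0.0, 1.0])
```

For a true signed distance field the gradient already has unit length, so the normalization mainly guards against discretization error.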
- Conference Article
20
- 10.1109/wacv56688.2023.00314
- Jan 1, 2023
Multi-view photometric stereo (MVPS) is a preferred method for detailed and precise 3D acquisition of an object from images. Although popular methods for MVPS can provide outstanding results, they are often complex to execute and limited to isotropic material objects. To address such limitations, we present a simple, practical approach to MVPS, which works well for isotropic as well as other object material types such as anisotropic and glossy. The proposed approach in this paper exploits the benefit of uncertainty modeling in a deep neural network for a reliable fusion of photometric stereo (PS) and multi-view stereo (MVS) network predictions. Yet, contrary to the recently proposed state-of-the-art, we introduce neural volume rendering methodology for a trustworthy fusion of MVS and PS measurements. The advantage of introducing neural volume rendering is that it helps in the reliable modeling of objects with diverse material types, where existing MVS methods, PS methods, or both may fail. Furthermore, it allows us to work on neural 3D shape representation, which has recently shown outstanding results for many geometric processing tasks. Our suggested new loss function aims to fit the zero level set of the implicit neural function using the most certain MVS and PS network predictions coupled with weighted neural volume rendering cost. The proposed approach shows state-of-the-art results when tested extensively on several benchmark datasets.
- Conference Article
32
- 10.1109/cvpr52688.2022.01227
- Jun 1, 2022
This paper presents a simple and effective solution to the longstanding classical multi-view photometric stereo (MVPS) problem. It is well-known that photometric stereo (PS) is excellent at recovering high-frequency surface details, whereas multi-view stereo (MVS) can help remove the low-frequency distortion due to PS and retain the global geometry of the shape. This paper proposes an approach that can effectively utilize such complementary strengths of PS and MVS. Our key idea is to combine them suitably while considering the per-pixel uncertainty of their estimates. To this end, we estimate per-pixel surface normals and depth using an uncertainty-aware deep-PS network and deep-MVS network, respectively. Uncertainty modeling helps select reliable surface normal and depth estimates at each pixel which then act as a true representative of the dense surface geometry. At each pixel, our approach either selects or discards deep-PS and deep-MVS network prediction depending on the prediction uncertainty measure. For dense, detailed, and precise inference of the object's surface profile, we propose to learn the implicit neural shape representation via a multilayer perceptron (MLP). Our approach encourages the MLP to converge to a natural zero-level set surface using the confident prediction from deep-PS and deep-MVS networks, providing superior dense surface reconstruction. Extensive experiments on the DiLiGenT-MV benchmark dataset show that our method provides high-quality shape recovery with a much lower memory footprint while outperforming almost all of the existing approaches.
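The per-pixel select-or-discard step described above can be sketched as a simple threshold on an uncertainty value. This is only an illustration under stated assumptions: the function name `select_confident`, the threshold, and the sample values are hypothetical, and the paper's actual uncertainty measure and networks are not reproduced here.

```python
def select_confident(predictions, uncertainties, max_uncertainty):
    """Keep each prediction whose uncertainty is at most max_uncertainty.

    Discarded pixels are marked with None so that a later fitting stage
    can skip them.
    """
    if len(predictions) != len(uncertainties):
        raise ValueError("predictions and uncertainties must align")
    return [
        p if u <= max_uncertainty else None
        for p, u in zip(predictions, uncertainties)
    ]

# Hypothetical depth predictions for three pixels and their uncertainties.
depths = [1.20, 1.25, 3.90]
depth_uncertainty = [0.02, 0.50, 0.03]
kept = select_confident(depths, depth_uncertainty, max_uncertainty=0.1)
# The second pixel is discarded because its uncertainty exceeds 0.1.
```

The same function could be applied separately to surface-normal and depth predictions, so that each pixel contributes whichever estimates are considered reliable.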
- Conference Article
31
- 10.1109/wacv51458.2022.00402
- Jan 1, 2022
We present a modern solution to the multi-view photometric stereo (MVPS) problem. Our work suitably exploits the image formation model in an MVPS experimental setup to recover the dense 3D reconstruction of an object from images. We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry. Contrary to previous multi-stage frameworks for MVPS, where the position, iso-depth contours, or orientation measurements are estimated independently and then fused later, our method is simple to implement and realize. Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network. We render the MVPS images by considering the object's surface normals for each 3D sample point along the viewing direction rather than explicitly using the density gradient in the volume space via 3D occupancy information. We optimize the proposed neural radiance field representation for the MVPS setup efficiently using a fully connected deep network to recover the 3D geometry of an object. Extensive evaluation on the DiLiGenT-MV benchmark dataset shows that our method performs better than the approaches that perform only PS or only multi-view stereo (MVS) and provides comparable results against the state-of-the-art multi-stage fusion methods.
- Research Article
4
- 10.1007/s00371-017-1430-5
- Aug 28, 2017
- The Visual Computer
This paper presents a hybrid approach for 3D reconstruction by fusing photometric stereo and multi-view stereo. The 3D surface is obtained by capturing a set of images taken from different viewpoints under time-varying illuminations. Key factors in the reconstruction process are surface normals that are obtained from photometric stereo. The surface is initialized by integrating the normals and then refined by performing iterative deformations on the initial surface, thereby optimizing image and normal consistency in multiple views. Benefiting from the deformation approach, we are able to perform image and normal consistency optimization without using matching windows. Instead, the complete surface is always back-projected. This makes the proposed approach much simpler and more robust compared to window-based approaches, which typically require global optimization with constraints on neighboring windows. Experiments on real-world data and ground-truth data show that, for diffuse mid-sized objects without large depth discontinuities, our approach improves the accuracy of the reconstructions compared to existing approaches.
- Conference Article
1
- 10.1049/cp:19950681
- Jan 1, 1995
The shading information in two or more images of a surface, obtained under different illuminations from a single camera, can be used for shape estimation in the process known as photometric stereo (PS). From the observation of the fact that pairs of photometric-stereo images viewed under a stereoscope produce an impression of depth which can be almost as striking as that produced by stereoscopic pairs, we have been led to the study of photometric stereo as a geometric image matching process. We have analysed the possibility of extracting shape information from the optical flow which results from the change of illumination in PS images. Two main results have arisen from our analysis. We have found that, (i) under quite general conditions, the photometric-stereo optical flow (PS flow) can be employed for the estimation of surface curvature; and (ii) under the assumption that a linear approximation to the reflectance map is appropriate, estimates of the relative-depth function can also be obtained from such flow. These results are discussed.
- Research Article
10
- 10.1007/s00138-013-0507-z
- Apr 14, 2013
- Machine Vision and Applications
Photometric stereo surface reconstruction requires each input image to be associated with a particular 3D illumination vector. This signifies that the subject should be illuminated in turn by various directional illumination sources. In real life, this directionality may be reduced by ambient illumination, which is typically present as a diffuse component of the incident light. This work assesses the photometric stereo reconstruction quality for various ratios of ambient to directional illuminance and provides a reference for the robustness of photometric stereo with respect to that illuminance ratio. In our analysis, we focus on the face reconstruction application of photometric stereo, as faces are convex objects with rich surface variation, thus providing a suitable platform for photometric stereo reconstruction quality evaluation. Results demonstrate that photometric stereo renders realistic reconstructions of the given surface for ambient illuminance as high as nine times the illuminance of the directional light component.
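To make the role of the illumination vectors concrete, here is a minimal sketch of classical Lambertian photometric stereo at a single pixel. It assumes exactly three non-coplanar unit illumination vectors, no shadows, and no ambient component, so it does not model the ambient-to-directional ratios studied in this work. The names `det3` and `solve_pixel` and the light directions are illustrative assumptions.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve_pixel(lights, intensities):
    """Recover (albedo, unit normal) at one pixel from three measurements.

    Assumed model: intensities[k] = albedo * dot(lights[k], normal).
    """
    d = det3(lights)
    if abs(d) < 1e-12:
        raise ValueError("illumination vectors must be non-coplanar")
    scaled_normal = []
    for col in range(3):
        # Cramer's rule: swap one column of the light matrix for the intensities.
        m = [list(row) for row in lights]
        for r in range(3):
            m[r][col] = intensities[r]
        scaled_normal.append(det3(m) / d)
    albedo = math.sqrt(sum(c * c for c in scaled_normal))
    if albedo == 0.0:
        raise ValueError("pixel is black under all lights")
    return albedo, [c / albedo for c in scaled_normal]

# Three hypothetical unit illumination vectors and the intensities they produce
# for a surface facing the camera (normal along +z) with albedo 0.5.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
albedo, normal = solve_pixel(lights, [0.5, 0.4, 0.4])
```

With more than three lights, the same model is usually solved per pixel in the least-squares sense rather than by an exact 3x3 solve.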
- Research Article
9
- 10.1007/s00138-014-0609-2
- Apr 9, 2014
- Machine Vision and Applications
Within the context of photometric stereo reconstruction, flatfielding may be used to compensate for the effect of the inverse-square law of light propagation on the pixel brightness. This would require capturing a set of reference images at an off-line imaging session, which employs a calibrating device that should be captured under exactly the same conditions as the main session. Similarly, the illumination vectors, on which photometric stereo relies, are typically precomputed based on another dedicated calibration session. In practice, implementing such off-line sessions is inconvenient and often infeasible. This work aims at enabling accurate photometric stereo reconstruction for the case of non-interactive on-line capturing of human faces. We propose unsupervised methodologies, which extract all information that is required for accurate face reconstruction from the images of interest themselves. Specifically, we propose an uncalibrated flatfielding and an uncalibrated illumination vector estimation methodology, and we assess their effect on photometric stereo face reconstruction. Results demonstrate that incorporating our methodologies into the photometric stereo framework halves the reconstruction error, while eliminating the need for off-line calibration.
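The inverse-square effect that flatfielding compensates for can be shown with a small worked example. The sketch below assumes a point light source and a known light-to-surface distance, which is precisely what the uncalibrated method in this work avoids needing. It therefore illustrates the physical law only, not the proposed methodology, and `compensate_falloff` is a hypothetical name.

```python
def compensate_falloff(brightness, distance, reference_distance=1.0):
    """Rescale brightness to the value expected at reference_distance.

    Assumes a point source whose irradiance falls off as 1 / distance**2.
    """
    if distance <= 0 or reference_distance <= 0:
        raise ValueError("distances must be positive")
    return brightness * (distance / reference_distance) ** 2

# A pixel lit from twice the reference distance receives a quarter of the
# light, so its brightness is multiplied by four.
corrected = compensate_falloff(0.25, distance=2.0)
```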
- Conference Article
180
- 10.1109/iccv.2003.1238405
- Jan 1, 2003
We present an algorithm for computing optical flow, shape, motion, lighting, and albedo from an image sequence of a rigidly-moving Lambertian object under distant illumination. The problem is formulated in a manner that subsumes structure from motion, multiview stereo, and photometric stereo as special cases. The algorithm utilizes both spatial and temporal intensity variation as cues: the former constrains flow and the latter constrains surface orientation; combining both cues enables dense reconstruction of both textured and textureless surfaces. The algorithm works by iteratively estimating affine camera parameters, illumination, shape, and albedo in an alternating fashion. Results are demonstrated on videos of hand-held objects moving in front of a fixed light and camera.
- Conference Article
54
- 10.1109/iccv.2013.148
- Dec 1, 2013
We propose a method for accurate 3D shape reconstruction using uncalibrated multiview photometric stereo. A coarse mesh reconstructed using multiview stereo is first parameterized using a planar mesh parameterization technique. Subsequently, multiview photometric stereo is performed in the 2D parameter domain of the mesh, where all geometric and photometric cues from multiple images can be treated uniformly. Unlike traditional methods, there is no need for merging view-dependent surface normal maps. Our key contribution is a new photometric stereo based mesh refinement technique that can efficiently reconstruct meshes with extremely fine geometric details by directly estimating a displacement texture map in the 2D parameter domain. We demonstrate that intricate surface geometry can be reconstructed using several challenging datasets containing surfaces with specular reflections, multiple albedos and complex topologies.
- Conference Article
3
- 10.1109/uic-atc.2017.8397465
- Aug 1, 2017
Surface normal maps created by photometric stereo allow for high-quality rendering from certain viewpoints, even when the resolution of the original images is low. However, the lack of constraints between multiple disconnected patches, the frequent presence of low-frequency distortion, and practical imaging conditions often lead to a bias when the photometric stereo reconstruction uses direct integration. In this paper, we therefore present a hybrid method that exploits the depth information produced by an encoded structured light system in order to correct the photometric stereo bias. At the same time, the method retains the high-precision normal information. Our experimental results show that the proposed method can not only recover high-frequency details but also avoid, or at least reduce, the low-frequency bias. In particular, the error of our method in the underwater environment remains tolerable, even at high turbidity values.
- Conference Article
- 10.1109/icma.2011.5985749
- Aug 1, 2011
In this paper, an effective technique is presented to reconstruct an accurate and reliable 3D surface model from multi-view stereo. Unlike classical approaches, we integrate stereo and motion analysis within an optical flow and scene flow framework. In this scheme, reconstruction is decoupled into two stages. First, the depth of feature points is recovered and used to build an intermediate polygonal mesh. Second, predictions of the comparison views, generated from the coarse mesh model, are fed back to deform the initial mesh and improve its quality. The discrepancy between each observed comparison view and its prediction is quantified by an optical flow field, from which the corresponding scene flow vector field is derived and then used for surface deformation. Because optical flow estimation is globally optimized and smoothed and is more robust to illumination change than traditional dense disparity estimation, the deformed surface gains accuracy, as the experimental results confirm.
- Research Article
318
- 10.1109/tpami.2007.70820
- Mar 1, 2008
- IEEE Transactions on Pattern Analysis and Machine Intelligence
This paper addresses the problem of obtaining complete, detailed reconstructions of textureless shiny objects. We present an algorithm which uses silhouettes of the object, as well as images obtained under changing illumination conditions. In contrast with previous photometric stereo techniques, ours is not limited to a single viewpoint but produces accurate reconstructions in full 3D. A number of images of the object are obtained from multiple viewpoints, under varying lighting conditions. Starting from the silhouettes, the algorithm recovers camera motion and constructs the object's visual hull. This is then used to recover the illumination and initialise a multi-view photometric stereo scheme to obtain a closed surface reconstruction. There are two main contributions in this paper: Firstly we describe a robust technique to estimate light directions and intensities and secondly, we introduce a novel formulation of photometric stereo which combines multiple viewpoints and hence allows closed surface reconstructions. The algorithm has been implemented as a practical model acquisition system. Here, a quantitative evaluation of the algorithm on synthetic data is presented together with complete reconstructions of challenging real objects. Finally, we show experimentally how even in the case of highly textured objects, this technique can greatly improve on correspondence-based multi-view stereo results.
- Research Article
2
- 10.9708/jksci.2011.16.4.073
- Apr 30, 2011
- Journal of the Korea Society of Computer and Information
In this paper, we propose a block-based background modeling method that adaptively adjusts the number of model histograms. Conventional block-based background modeling fixes the number of model histograms for each block. As a result, false detections occur under illumination changes and with moving objects, while motionless objects go undetected. In addition, the optimal number of model histograms, which can differ for each type of input video, has to be found manually. We demonstrate the effectiveness of the proposed algorithm by comparing it experimentally with the conventional method inside an elevator, both when the illumination changes and objects move and when the illumination is constant and objects are stationary.