3D computer vision and graphics have grown rapidly in recent years, enabling the creation of realistic virtual environments and digital representations of real-world objects. Central to this progress are 3D reconstruction methods that virtualize the shape, color, and surface details of real objects. Current methods predominantly employ neural scene representations, which, despite their efficacy, have limitations: they require a large number of captured images, and converting them into explicit geometric forms is complex. An alternative strategy that has gained traction is the use of physically-based differentiable rendering (PBDR) and inverse rendering. These approaches require fewer viewpoints, yield results in an explicit format, and ensure a smoother transition to other representation methods. To meaningfully assess the performance of different 3D reconstruction methods, benchmark scenes are needed for comparison. Despite the existence of standard objects and scenes in the literature, there is a noticeable lack of real-world benchmark data that simultaneously captures camera, illumination, and scene parameters, all of which are critical to high-fidelity 3D reconstruction with PBDR and inverse rendering-based methods. In this research, we introduce a methodology for capturing real-world scenes as virtual scenes, integrating illumination parameters alongside camera and scene parameters to improve the fidelity of the virtual representations. We also introduce a set of ten real-world scenes, along with their virtual counterparts, designed as benchmarks. These benchmarks cover a fundamental variety of geometric constructs, including convex, concave, plain, and mixed surfaces.
Finally, we present the results of state-of-the-art PBDR-based 3D reconstruction methods on real-world scenes, using both established methodologies and our proposed one.