Abstract

Neural rendering approaches enable photo-realistic novel view synthesis, but their per-scene optimization remains an obstacle to scalability. Recent methods introduce neural radiance field (NeRF) frameworks that generalize to unseen scenes on the fly by combining multi-view stereo with differentiable volume rendering. These generalizable NeRF methods synthesize the colors of 3D ray points by learning the consistency of image features projected from nearby views. Since this consistency is computed in the 2D projected image space, it is vulnerable to occlusion and to local shape variation across viewing directions. To address this problem, we present a dense depth-guided generalizable NeRF that leverages depth as the signed distance between each ray point and the object surface of the scene. We first generate dense depth maps from the sparse 3D points of structure from motion (SfM), an unavoidable step in obtaining camera poses. Next, the dense depth maps are exploited both as complementary features invariant to the sparsity of nearby views and as masks for occlusion handling. Experiments demonstrate that our approach outperforms existing generalizable NeRF methods on widely used real and synthetic datasets.
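
The sketch below illustrates the general idea of a signed-distance depth cue, under assumed notation rather than the paper's exact formulation: a 3D ray sample is projected into a nearby source view, its camera-space depth is compared against the dense depth map of that view, and the sign of the difference gives a crude visibility mask. All names here (project_to_view, signed_depth_feature, etc.) are illustrative, not from the paper.

```python
# Minimal sketch (assumed formulation): signed distance between ray samples
# and the surface observed by one source view with a dense depth map.
import numpy as np

def project_to_view(points_world, K, w2c):
    """Project Nx3 world points into a source view.

    Returns pixel coordinates (N, 2) and each point's depth along the
    source camera's optical axis (N,).
    """
    ones = np.ones((points_world.shape[0], 1))
    points_h = np.concatenate([points_world, ones], axis=1)    # (N, 4) homogeneous
    points_cam = (w2c @ points_h.T).T[:, :3]                   # world -> camera
    z = points_cam[:, 2]                                       # depth in source camera
    uv = (K @ points_cam.T).T
    uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-8, None)           # perspective divide
    return uv, z

def signed_depth_feature(points_world, depth_map, K, w2c):
    """Signed distance of each ray sample to the surface seen by one view.

    Positive: sample lies in front of the observed surface (free space).
    Negative: sample lies behind the surface (likely occluded in this view).
    """
    H, W = depth_map.shape
    uv, z = project_to_view(points_world, K, w2c)
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, W - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, H - 1)
    surface_depth = depth_map[v, u]                            # dense depth lookup
    signed_dist = surface_depth - z                            # + in front, - behind
    occlusion_mask = (signed_dist > -1e-2).astype(np.float32)  # crude visibility mask
    return signed_dist, occlusion_mask
```

In this reading, the signed distance serves as a per-view feature for the ray sample, and the mask down-weights source views in which the sample falls behind the observed surface; the actual method may compute and fuse these cues differently.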
