Abstract

Learning representations of multiple objects or scenes is a rising research topic in the Machine Learning (ML) community. Here, we propose a multi-scene representation model that learns representations of complex scenes and reconstructs them at high resolution from novel viewing directions. Our method represents each scene with a set of fully-connected layers, and each set of fully-connected layers is controlled by a hyper-network to enable multi-scene modeling. For each scene, the model takes 3D coordinates (x, y, z) and a 2D viewing direction (θ, ɸ) as inputs, and the fully-connected layers output the volume density and RGB value at the given 3D position. The predicted densities and colors are then composited along camera rays into images using volume rendering techniques. During training, we optimize a continuous volumetric scene function from a small number of input views. By designing a versatile embedding module and multi-scene representation networks, our model renders photorealistic images of different complex scenes from novel viewing directions. Experimental results demonstrate the neural rendering and multi-scene representation abilities of our model, and thorough experiments show that our method outperforms previous models in both reconstruction precision and the ability to generate scenes from novel viewing directions.
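To make the pipeline described above concrete, the following is a minimal sketch (not the authors' code) of a scene-conditioned fully-connected network: a hyper-network maps an assumed per-scene latent code to the weights of a small MLP, the MLP maps (x, y, z, θ, ɸ) to a volume density and an RGB color, and the results are composited along a camera ray with standard volume rendering. All layer sizes, the single-hidden-layer architecture, and the latent-code dimension are illustrative assumptions.

```python
import numpy as np

# Sketch only: sizes, names, and architecture are assumptions, not paper details.
rng = np.random.default_rng(0)

IN_DIM, HIDDEN, OUT_DIM = 5, 64, 4   # (x, y, z, theta, phi) -> (sigma, r, g, b)
LATENT_DIM = 16                      # assumed per-scene latent code size


def init_hypernet():
    """Random hyper-network parameters: scene code -> flattened MLP weights."""
    n_weights = IN_DIM * HIDDEN + HIDDEN + HIDDEN * OUT_DIM + OUT_DIM
    return {"W": rng.normal(0, 0.05, (LATENT_DIM, n_weights)),
            "b": np.zeros(n_weights)}


def scene_mlp_params(hyper, z):
    """Generate the per-scene MLP weights from the scene latent code z."""
    flat = z @ hyper["W"] + hyper["b"]
    i = 0
    W1 = flat[i:i + IN_DIM * HIDDEN].reshape(IN_DIM, HIDDEN); i += IN_DIM * HIDDEN
    b1 = flat[i:i + HIDDEN]; i += HIDDEN
    W2 = flat[i:i + HIDDEN * OUT_DIM].reshape(HIDDEN, OUT_DIM); i += HIDDEN * OUT_DIM
    b2 = flat[i:i + OUT_DIM]
    return W1, b1, W2, b2


def scene_mlp(params, pts):
    """pts: (N, 5) array of (x, y, z, theta, phi); returns sigma (N,) and rgb (N, 3)."""
    W1, b1, W2, b2 = params
    h = np.maximum(pts @ W1 + b1, 0.0)           # ReLU hidden layer
    out = h @ W2 + b2
    sigma = np.log1p(np.exp(out[:, 0]))          # softplus -> non-negative density
    rgb = 1.0 / (1.0 + np.exp(-out[:, 1:]))      # sigmoid -> colors in [0, 1]
    return sigma, rgb


def render_ray(params, origin, direction, theta, phi, n_samples=64, near=0.0, far=4.0):
    """Composite color along one camera ray with standard volume-rendering quadrature."""
    t = np.linspace(near, far, n_samples)
    pts = origin[None, :] + t[:, None] * direction[None, :]
    inputs = np.concatenate([pts,
                             np.full((n_samples, 1), theta),
                             np.full((n_samples, 1), phi)], axis=1)
    sigma, rgb = scene_mlp(params, inputs)
    delta = np.diff(t, append=t[-1] + (t[-1] - t[-2]))           # sample spacing
    alpha = 1.0 - np.exp(-sigma * delta)                          # per-segment opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)                   # expected ray color


# Toy usage: one scene code, one ray.
hyper = init_hypernet()
scene_code = rng.normal(size=LATENT_DIM)
params = scene_mlp_params(hyper, scene_code)
color = render_ray(params, origin=np.zeros(3),
                   direction=np.array([0.0, 0.0, 1.0]), theta=0.3, phi=1.2)
print("rendered RGB:", color)
```

In a training loop, the hyper-network parameters and the per-scene latent codes would be optimized jointly against a photometric loss on the rendered pixels; here everything is randomly initialized purely to illustrate the data flow.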
