Abstract

Real-time indoor scene reconstruction aims to recover the 3D geometry of an indoor scene in real time while a sensor scans it. Previous work on this topic assumes purely static scenes; in this paper, we address the more challenging case in which the scene contains dynamic objects, such as moving people and floating curtains, which are common in practice and therefore need to be handled. We develop an end-to-end system that uses a depth sensor to scan a scene on the fly. By proposing a Sigmoid-based Iterative Closest Point (S-ICP) method, we decouple the camera motion and the scene motion from the input sequence and accordingly segment the scene into static and dynamic parts. The static part is used to estimate the rigid camera motion, while for the dynamic part, a graph-node-based motion representation and model-to-depth fitting are applied to reconstruct the scene motion. With the camera and scene motions reconstructed, we further propose a novel mixed voxel allocation scheme that handles static and dynamic scene parts with different mechanisms, which helps to progressively fuse a large scene containing both static and dynamic objects. Experiments show that our technique fuses the geometry of both static and dynamic objects in a scene in real time, extending the applicability of current indoor scene reconstruction techniques.
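The abstract does not give the exact S-ICP formulation, so the sketch below is only a rough illustration of the idea it describes: a sigmoid over per-correspondence ICP residuals can softly down-weight likely-dynamic points when estimating the rigid camera motion, and the resulting weights can then be thresholded to segment the scene into static and dynamic parts. The function names (sigmoid_weights, weighted_point_to_plane_step) and the parameters threshold and steepness are hypothetical, not from the paper.

import numpy as np

def sigmoid_weights(residuals, threshold=0.02, steepness=200.0):
    """Soft static/dynamic weights from per-point ICP residuals (meters).

    Points whose residual stays below `threshold` are treated as mostly
    static and get a weight near 1; points with large residuals are
    likely moving and get a weight near 0. The values of `threshold`
    and `steepness` are illustrative assumptions, not the paper's.
    """
    return 1.0 / (1.0 + np.exp(steepness * (residuals - threshold)))

def weighted_point_to_plane_step(src, dst, normals, weights):
    """One Gauss-Newton step of sigmoid-weighted point-to-plane ICP.

    Solves for a small rigid motion x = (rx, ry, rz, tx, ty, tz)
    minimizing sum_i w_i * (n_i . (R src_i + t - dst_i))^2, linearized
    around the identity. src, dst, normals are (N, 3) arrays of matched
    source points, target points, and target normals.
    """
    # Jacobian rows: [src_i x n_i, n_i]; residuals: n_i . (src_i - dst_i)
    J = np.hstack([np.cross(src, normals), normals])   # (N, 6)
    r = np.einsum("ij,ij->i", normals, src - dst)      # (N,)
    W = weights[:, None]
    A = J.T @ (W * J)                                  # (6, 6) normal matrix
    b = -J.T @ (weights * r)
    return np.linalg.solve(A, b)

In a loop, one would recompute residuals and weights after each step; on convergence, points with weight below, say, 0.5 would be marked dynamic and passed to the non-rigid (graph-node) stage, mirroring the static/dynamic split the abstract describes.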
