Abstract

3D scene reconstruction is an important topic in computer vision. A complete scene is reconstructed from views acquired along the camera trajectory, each view containing a small part of the scene. Camera tracking in textureless scenes is a notoriously hard problem, and obtaining accurate 3D models quickly is a major challenge for existing systems. Targeting robotics applications, we propose a robust CPU-based approach to reconstruct indoor scenes efficiently with a consumer RGB-D camera. The proposed approach bridges feature-based camera tracking and volumetric data integration, and achieves good reconstruction performance in terms of both robustness and efficiency. The key points of our approach are: (i) a robust and fast camera tracking method combining points and edges, which improves tracking stability in textureless scenes; (ii) an efficient data fusion strategy that selects camera views and integrates RGB-D images at multiple scales, which enhances the efficiency of volumetric integration; (iii) a novel RGB-D scene reconstruction system that can be implemented quickly on a standard CPU. Experimental results demonstrate that our approach reconstructs scenes with higher robustness and efficiency than state-of-the-art reconstruction systems.
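
As a rough illustration of the point-plus-edge tracking idea in (i), the minimal sketch below combines a point reprojection error with an edge term obtained by sampling a distance transform of the current image's edge map. The function names, the weight w_edge, and the nearest-neighbour sampling are our assumptions for illustration, not the paper's actual formulation.

```python
# Illustrative sketch (not the authors' implementation): a tracking cost that
# couples sparse point reprojection residuals with edge residuals read from a
# distance transform of the image edge map.
import numpy as np

def project(points_cam, fx, fy, cx, cy):
    """Pinhole projection of 3-D points (N, 3) in camera coordinates to pixels (N, 2)."""
    z = points_cam[:, 2]
    u = fx * points_cam[:, 0] / z + cx
    v = fy * points_cam[:, 1] / z + cy
    return np.stack([u, v], axis=1)

def point_residuals(points_world, observed_px, R, t, fx, fy, cx, cy):
    """Reprojection error of matched 3-D feature points against their 2-D observations."""
    cam = points_world @ R.T + t
    return project(cam, fx, fy, cx, cy) - observed_px          # (N, 2)

def edge_residuals(edge_points_world, dist_transform, R, t, fx, fy, cx, cy):
    """Distance of each projected edge point to the nearest image edge,
    looked up in a precomputed distance transform (nearest-neighbour sampling)."""
    cam = edge_points_world @ R.T + t
    px = project(cam, fx, fy, cx, cy)
    h, w = dist_transform.shape
    u = np.clip(np.round(px[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(px[:, 1]).astype(int), 0, h - 1)
    return dist_transform[v, u]                                 # (M,)

def combined_cost(R, t, pts_w, obs_px, edges_w, dist_transform, K, w_edge=0.5):
    """Weighted sum of squared point and edge residuals, minimised over the pose (R, t)."""
    fx, fy, cx, cy = K
    rp = point_residuals(pts_w, obs_px, R, t, fx, fy, cx, cy)
    re = edge_residuals(edges_w, dist_transform, R, t, fx, fy, cx, cy)
    return np.sum(rp ** 2) + w_edge * np.sum(re ** 2)
```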

Highlights

  • Consumer RGB-D cameras, such as the Asus Xtion and Structure Sensor, provide an opportunity to develop indoor scene reconstruction systems conveniently. KinectFusion [1,2] is an outstanding method to generate photorealistic dense 3D models on a GPU.

  • In order to further improve the performance of volumetric integration, we propose a camera view selection algorithm to prune away redundant camera views and quickly integrate the selected RGB-D images with a multi-scale Truncated Signed Distance Function (TSDF).

  • Loop closure key frames are detected during camera tracking. The similarity ratio ρi,j between the ith and jth frames is measured by co-visibility content information [37] and defined as ρi,j = ni / nj, where ni and nj are the numbers of available pixels of the ith and jth depth images in the ith frame's coordinate system (illustrated in the sketch below).
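
A minimal sketch of this co-visibility ratio, assuming the relative pose (R_ij, t_ij) from frame j to frame i and shared pinhole intrinsics are known. Treating "available" pixels as those with positive depth that project inside the image is our reading of the definition, and the helper names are illustrative, not from the paper.

```python
# Sketch of the similarity ratio rho_ij = n_i / n_j, counted in frame i's coordinates.
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Turn a depth image (H, W) into a point cloud (H*W, 3); invalid depths stay at 0."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    x = (u.ravel() - cx) * z / fx
    y = (v.ravel() - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def count_visible_in_frame_i(depth_j, R_ij, t_ij, fx, fy, cx, cy, shape_i):
    """Warp depth image j into frame i and count pixels that land inside the image
    with positive depth, i.e. pixels of frame j 'available' in frame i."""
    pts_j = backproject(depth_j, fx, fy, cx, cy)
    valid = pts_j[:, 2] > 0
    pts_i = pts_j[valid] @ R_ij.T + t_ij
    in_front = pts_i[:, 2] > 0
    u = fx * pts_i[in_front, 0] / pts_i[in_front, 2] + cx
    v = fy * pts_i[in_front, 1] / pts_i[in_front, 2] + cy
    h, w = shape_i
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    return int(np.count_nonzero(inside))

def similarity_ratio(depth_i, depth_j, R_ij, t_ij, fx, fy, cx, cy):
    """rho_ij = n_i / n_j, with both counts taken in the ith frame's coordinate system."""
    n_i = int(np.count_nonzero(depth_i > 0))
    n_j = count_visible_in_frame_i(depth_j, R_ij, t_ij, fx, fy, cx, cy, depth_i.shape)
    return n_i / max(n_j, 1)
```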


Summary

Introduction

KinectFusion [1,2] is an outstanding method to generate photorealistic dense 3D models on a GPU. It uses a volumetric representation based on the Truncated Signed Distance Function (TSDF) [3] to represent scenes, in conjunction with fast Iterative Closest Point (ICP) [4] pose estimation, to provide a real-time fused dense model. In contrast to dense ICP-based tracking methods, sparse feature-based methods extract features in RGB images and estimate the camera motion between images. They are more efficient and are widely used in sparse Simultaneous Localization and Mapping (SLAM) systems. We present a new CPU-based RGB-D indoor scene reconstruction framework, which combines dense volumetric integration with sparse feature-based tracking and can be applied to indoor scene reconstruction with high robustness and efficiency.
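
To make the volumetric side concrete, here is a compact, unoptimised sketch of TSDF integration in the spirit of KinectFusion [1-3]: every voxel is projected into the depth image and its truncated signed distance is folded into a running weighted average. The voxel layout, truncation handling, and function signature are illustrative assumptions, not the multi-scale CPU integration described in this paper.

```python
# Sketch of single-scale TSDF fusion of one depth frame into a voxel volume.
import numpy as np

def integrate_tsdf(tsdf, weights, depth, K, T_wc, voxel_size, origin, trunc=0.05):
    """Fuse one depth image into a TSDF volume.

    tsdf, weights : float arrays (X, Y, Z) holding signed distances and fusion weights
    depth         : (H, W) depth image in metres
    K             : (fx, fy, cx, cy) pinhole intrinsics
    T_wc          : 4x4 world-to-camera transform
    origin        : world coordinates of voxel (0, 0, 0)
    """
    fx, fy, cx, cy = K
    h, w = depth.shape
    xs, ys, zs = np.indices(tsdf.shape)
    # Voxel centres in world coordinates, then transformed into the camera frame.
    pts_w = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3) * voxel_size + origin
    pts_c = pts_w @ T_wc[:3, :3].T + T_wc[:3, 3]
    z = pts_c[:, 2]
    u = np.round(fx * pts_c[:, 0] / np.maximum(z, 1e-6) + cx).astype(int)
    v = np.round(fy * pts_c[:, 1] / np.maximum(z, 1e-6) + cy).astype(int)
    ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = np.zeros_like(z)
    d[ok] = depth[v[ok], u[ok]]
    # Signed distance along the viewing ray, truncated to [-trunc, trunc].
    sdf = d - z
    upd = ok & (d > 0) & (sdf > -trunc)
    sdf = np.clip(sdf / trunc, -1.0, 1.0)
    # Per-voxel running weighted average, one unit of weight per observation.
    flat_tsdf, flat_w = tsdf.reshape(-1), weights.reshape(-1)
    flat_tsdf[upd] = (flat_tsdf[upd] * flat_w[upd] + sdf[upd]) / (flat_w[upd] + 1)
    flat_w[upd] += 1
    return tsdf, weights
```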

Camera Tracking
Volumetric Integration
System Overview
Tracking via Points and Edges
Efficient Data Fusion
Findings
Conclusions