Abstract

Traditional urban reconstruction methods can only output incomplete 3D models, which cover just the scene regions visible to the moving camera. While learning-based shape reconstruction techniques make single-view 3D reconstruction possible, they are designed to handle single objects that are well represented in the training datasets. This paper presents a novel learning-based approach for reconstructing complete 3D meshes of large-scale urban scenes in real time. The input video sequences are fed into a localization module, which segments individual objects and determines their relative positions. Each object is then reconstructed in its local coordinate frame to better match the models in the training datasets. The reconstruction module is adapted from BSP-Net (Chen et al., 2020), which is capable of producing compact polygon meshes. However, major changes have been made so that unoriented objects in large-scale scenes can be reconstructed efficiently using only a small number of planes. Experimental results demonstrate that our approach can reconstruct urban scenes with buildings and vehicles using 400∼800 convex parts in 0.1∼0.5 s.
