Multi-view 3D reconstruction generally adopts a feature-fusion strategy to guide the generation of a 3D shape from different views of an object. Empirically, learning correspondences between object regions across views enables better feature fusion. However, this idea has not been fully exploited by existing methods. Furthermore, current methods fail to explore the intrinsic dependencies among regions within a 3D shape, leading to coarse reconstruction results. To address these issues, we propose a Dual-View 3D Point Cloud reconstruction architecture named DVPC, which takes images from two views as input and progressively generates a refined 3D point cloud. First, a point cloud generation network produces a coarse point cloud for each input view. Second, a dual-view point cloud synthesis network constructs a regional attention mechanism that learns high-quality correspondences among regions across the two coarse point clouds, so that DVPC can fuse features accurately. It then employs a point cloud deformation module that produces a relatively precise point cloud by establishing communication between the coarse point cloud and the fused feature. Lastly, a point-region transformer network models the dependencies among regions within the relatively precise point cloud, and with these dependencies refines it into a detailed 3D point cloud. Qualitative and quantitative experiments on the ShapeNet and Pix3D datasets demonstrate that the proposed DVPC outperforms state-of-the-art methods in terms of reconstruction quality.
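The abstract does not include code; the following is a minimal sketch, assuming a PyTorch implementation, of how regional cross-attention between two sets of region features (one per coarse point cloud) could yield a fused feature for the subsequent deformation step. The class name, dimensions, and the use of `nn.MultiheadAttention` are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn


class RegionalCrossAttentionFusion(nn.Module):
    """Hypothetical sketch: region features from two coarse point clouds
    (one per input view) attend to each other, and the attended streams
    are projected into a single fused feature per region."""

    def __init__(self, feat_dim=256, num_heads=4):
        super().__init__()
        self.attn_a2b = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.attn_b2a = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.fuse = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, feat_dim),
        )

    def forward(self, regions_a, regions_b):
        # regions_a, regions_b: (batch, num_regions, feat_dim) descriptors
        # pooled from the two coarse point clouds.
        a_attends_b, _ = self.attn_a2b(regions_a, regions_b, regions_b)
        b_attends_a, _ = self.attn_b2a(regions_b, regions_a, regions_a)
        # Concatenate the two attended streams and project to a fused feature,
        # which would condition the point cloud deformation module.
        return self.fuse(torch.cat([a_attends_b, b_attends_a], dim=-1))


if __name__ == "__main__":
    fusion = RegionalCrossAttentionFusion()
    view_a = torch.randn(2, 32, 256)  # 32 region features from view-A coarse cloud
    view_b = torch.randn(2, 32, 256)  # 32 region features from view-B coarse cloud
    print(fusion(view_a, view_b).shape)  # torch.Size([2, 32, 256])
```

Using two directional attention passes (A attends to B and B attends to A) is one common way to realize cross-view correspondence learning; the paper's actual regional attention mechanism may differ.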