Abstract

3D object reconstruction from a single-view image is a challenging task. Because the information contained in one isolated image is insufficient for faithful 3D shape reconstruction, existing single-view 3D reconstruction methods often lose marginal voxels and miss salient semantic information. To tackle these problems, we propose the Semantic Autoencoder-Attention Network (SAAN) for single-view 3D reconstruction. Distinct from the common autoencoder (AE) structure, the proposed network consists of two successive parts. The first part comprises two parallel branches, a 3D autoencoder (3DAE) and an Attention Network: the 3DAE reconstructs the general shape with an AE model, while the Attention Network supplements the missing details through a 3D reconstruction attention mechanism. In the second part, we compare the semantic information between the input image and the voxels generated by the autoencoder-attention network (AAN), using a Res101 network and a 3D-Res101 network. The comparison results are fed back to the AAN to adjust its parameters iteratively until it produces the voxel representation with the best semantic characterization. In the experiments, we verify the feasibility of our network on the ShapeNet dataset. Compared with state-of-the-art methods, the proposed SAAN produces more precise 3D object models in terms of both qualitative and quantitative evaluation.
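To make the two-part pipeline concrete, below is a minimal, hypothetical PyTorch sketch of the design the abstract describes. All module names (Encoder2D, Decoder3D, AttentionBranch, VoxelEncoder3D, SAAN), layer sizes, and the cosine-similarity semantic loss are illustrative assumptions, not the authors' implementation; torchvision's resnet101 stands in for "Res101", and the 3D-Res101 voxel encoder is stubbed with a small 3D CNN because no off-the-shelf version exists.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class Encoder2D(nn.Module):
    """Image encoder shared by both branches (assumed design)."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim))

    def forward(self, x):
        return self.net(x)


class Decoder3D(nn.Module):
    """3DAE branch: latent code -> coarse 32^3 occupancy logits (assumed size)."""
    def __init__(self, dim=256):
        super().__init__()
        self.fc = nn.Linear(dim, 64 * 4 * 4 * 4)
        self.up = nn.Sequential(
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1))

    def forward(self, z):
        return self.up(self.fc(z).view(-1, 64, 4, 4, 4))  # (B, 1, 32, 32, 32)


class AttentionBranch(nn.Module):
    """Attention branch: adds detail voxels gated by a per-voxel attention map."""
    def __init__(self, dim=256):
        super().__init__()
        self.detail = Decoder3D(dim)
        self.gate = nn.Conv3d(1, 1, 3, padding=1)

    def forward(self, z, coarse):
        attn = torch.sigmoid(self.gate(coarse))   # where to supplement details
        return coarse + attn * self.detail(z)


class SAAN(nn.Module):
    """Part one: parallel 3DAE and attention branches over a shared encoder."""
    def __init__(self, dim=256):
        super().__init__()
        self.enc = Encoder2D(dim)
        self.dec = Decoder3D(dim)
        self.attn = AttentionBranch(dim)

    def forward(self, img):
        z = self.enc(img)
        return self.attn(z, self.dec(z))


class VoxelEncoder3D(nn.Module):
    """Stand-in for the paper's 3D-Res101 voxel feature extractor (assumed)."""
    def __init__(self, dim=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(64, dim))

    def forward(self, v):
        return self.net(v)


def semantic_loss(img, voxels, img_net, vox_net):
    """Part two: penalize semantic mismatch between image and generated voxels."""
    f_img = img_net(img)                    # 2048-d Res101 image features
    f_vox = vox_net(torch.sigmoid(voxels))  # 2048-d features of generated voxels
    return 1 - nn.functional.cosine_similarity(f_img, f_vox).mean()


model = SAAN()
res101 = models.resnet101(weights=None)
res101.fc = nn.Identity()                   # expose pooled 2048-d features
vox_net = VoxelEncoder3D()

img = torch.randn(2, 3, 224, 224)           # dummy batch of input views
loss = semantic_loss(img, model(img), res101, vox_net)
loss.backward()                             # feedback signal that tunes the AAN
```

In this reading, the abstract's "iterative parameter adjustment" corresponds to repeatedly backpropagating the semantic-consistency loss into the AAN during training; in practice it would be combined with a standard voxel reconstruction loss against ground truth.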

