Abstract

This work is motivated by the need to understand the deep learning-based 3D reconstruction process for applications in aerial close-range photogrammetry. Given the highly dynamic nature of such a setting, both the accuracy and well-understood behavior of traditional reconstruction methods and the generalization capabilities of deep learning-based methods are required. However, state-of-the-art methods typically fall short of these requirements. We present a two-part machine learning-based approach that relies on autoencoder-like models. The first part is a Sparse AutoEncoder (SAE) that takes a single image as input and reconstructs a 3D voxel grid. The input images are then sorted by the reconstruction quality of the SAE output. The second part is a Variational AutoEncoder (VAE) that processes multiple images sampled from the ordered set to generate an enhanced 3D voxel grid. The work highlights a novel approach to 3D model reconstruction and offers insights into the process of 3D reconstruction from single-image inputs. The autoencoders are trained on a dataset comprising multiple objects, with images captured from different zenith and azimuth angles to simulate the viewpoint of an aerial vehicle. We show the efficacy of the proposed approach by reconstructing a 3D voxel grid for one class of the ModelNet40 dataset.
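
The sorting step between the two models can be illustrated with a minimal sketch. The abstract does not state which quality measure is used, so voxel intersection-over-union (IoU) against a reference grid is assumed here purely for illustration. The function names `voxel_iou`, `rank_views`, and `select_top_views` are hypothetical, and taking the top k views is only one possible way of sampling from the ordered set.

```python
import numpy as np


def voxel_iou(pred, target, threshold=0.5):
    """IoU between a predicted occupancy grid and a binary reference grid.

    Assumption: IoU stands in for the unspecified reconstruction-quality measure.
    """
    p = pred >= threshold          # binarize predicted occupancy probabilities
    t = target.astype(bool)
    union = np.logical_or(p, t).sum()
    if union == 0:                 # both grids empty: treat as a perfect match
        return 1.0
    return float(np.logical_and(p, t).sum() / union)


def rank_views(predictions, target, threshold=0.5):
    """Return view indices sorted from best to worst score, plus sorted scores."""
    scores = np.array([voxel_iou(p, target, threshold) for p in predictions])
    order = np.argsort(-scores, kind="stable")   # descending, ties keep input order
    return order, scores[order]


def select_top_views(order, k):
    """Hypothetical sampling rule: keep the k best-ranked views."""
    return order[:k]


if __name__ == "__main__":
    # Toy 4x4x4 reference grid with its lower half occupied.
    target = np.zeros((4, 4, 4))
    target[:2] = 1.0

    half = np.zeros((4, 4, 4))
    half[:1] = 1.0                 # covers half of the occupied region
    bad = np.zeros((4, 4, 4))
    bad[2:] = 1.0                  # occupies only the empty region
    good = target.copy()           # exact match

    order, scores = rank_views([half, bad, good], target)
    print("ranked view indices:", order.tolist())
    print("scores:", scores.tolist())
    print("views passed on:", select_top_views(order, 2).tolist())
```

In this sketch the ranked indices, not the images themselves, are passed on, so the multi-image model can look up the corresponding inputs in whatever form it expects.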
