Abstract

Real-world objects carry diverse, multimodal information, including 3D shape (geometry and topology) and 2D images (appearance and semantics). Representing and correlating these modalities in a unified way remains challenging because their representations differ. In this paper, we present a novel method that learns a unified latent space for the joint representation and simultaneous generation of 3D point clouds and 2D images. We propose a new geometry-aware autoencoder for 3D shapes that combines a full-resolution shape feature extractor with a multi-resolution geometric feature extractor operating at different scales, which enhances the geometric variability and scalability of the latent representation. The proposed mixer, i.e., a joint latent space, then synergistically integrates and complements the encoded features from 3D geometry and 2D content through our intermodality feature mapping and intramodality feature consistency design. Our joint latent space can simultaneously generate multimodal representations and correlations with high quality, high fidelity, and high cross-modality similarity, which traditional single-modal methods cannot handle. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods in shape auto-encoding as well as in simultaneous multimodal (SMM) shape and color image generation and interpolation. Furthermore, jointly learning the 2D and 3D facets of a shape for the novel SMM semantic-aware generation task improves the corresponding single-modality, single-task capabilities.
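
To make the idea of combining full-resolution and multi-resolution shape features with image features more concrete, the following is a toy sketch only, not the method of the paper. Every name in it (`pooled_descriptor`, `subsample`, `encode_shape`, `joint_latent`), the choice of scales, the random subsampling, and the use of plain concatenation as a stand-in for the mixer are assumptions made for illustration; the actual extractors, mapping, and consistency design are learned networks that the abstract does not specify.

```python
import random

def pooled_descriptor(points):
    """Max-pool and mean-pool the xyz coordinates of a point set.

    Pooling makes the descriptor independent of point order.
    Hypothetical stand-in for a learned per-point feature extractor.
    """
    n = len(points)
    maxes = [max(p[i] for p in points) for i in range(3)]
    means = [sum(p[i] for p in points) / n for i in range(3)]
    return maxes + means  # 6 values

def subsample(points, k, seed=0):
    """Keep k randomly chosen points (assumed stand-in for a real sampler)."""
    return random.Random(seed).sample(points, k)

def encode_shape(points, scales=(64, 16)):
    """Concatenate a full-resolution descriptor with one per coarser scale."""
    feat = pooled_descriptor(points)  # full resolution
    for k in scales:                  # coarser resolutions
        feat += pooled_descriptor(subsample(points, min(k, len(points))))
    return feat

def joint_latent(shape_feat, image_feat):
    """Toy 'mixer': concatenate both modality features into one vector."""
    return list(shape_feat) + list(image_feat)

# Synthetic data: a random point cloud and a placeholder image feature.
rng = random.Random(42)
cloud = [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(256)]
image_feat = [0.0] * 8  # placeholder for the output of a 2D image encoder

z = joint_latent(encode_shape(cloud), image_feat)
print(len(z))  # 6 values * 3 resolutions + 8 image values = 26
```

In a learned system, the concatenation would be replaced by trainable mappings between the two modalities, and decoders would generate the point cloud and the image from the shared vector `z`.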
