Abstract
Optimal viewpoint prediction is an essential task in many computer graphics applications. Unfortunately, common viewpoint quality measures suffer from two major drawbacks: a dependency on clean surface meshes, which are not always available, and the lack of closed-form expressions, which requires a costly search involving rendering. To overcome these limitations, we propose to separate viewpoint selection from rendering through an end-to-end learning approach, whereby we reduce the influence of mesh quality by predicting viewpoints from unstructured point clouds instead of polygonal meshes. While this makes our approach insensitive to the mesh discretization during evaluation, it only becomes possible when resolving label ambiguities that arise in this context. We therefore additionally propose to incorporate the label generation into the training procedure, making the label decision adaptive to the current network predictions. We show how our approach allows for learning viewpoint predictions for models from different object categories and for different viewpoint qualities. Additionally, we show that prediction times are reduced from several minutes to a fraction of a second compared to state-of-the-art (SOTA) viewpoint quality evaluation. Code and training data are available at https://github.com/schellmi42/viewpoint_learning; to our knowledge, this is the largest viewpoint quality dataset available.
Highlights
Objectives
We aim to resolve the label-ambiguity problem by moving the label decision from a preprocessing step into the training process, making it dependent on the current network prediction.
To demonstrate the proposed deep learning technique, we consider four different viewpoint quality measures, selected for their effectiveness in previous studies and their popularity: Viewpoint Entropy (VE) [VFSH01], Visibility Ratio (VR), also referred to as surface area [PB96], Viewpoint Kullback-Leibler divergence (VKL) [SPFG05], and Viewpoint Mutual Information (VMI) [FSG09].
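Two of these measures have particularly compact definitions. Viewpoint Entropy [VFSH01] is the Shannon entropy of the relative projected areas of the visible faces (plus the background), and the Visibility Ratio [PB96] is the fraction of the total surface area visible from the view. A minimal sketch, assuming per-face projected and surface areas have already been obtained from a renderer or visibility query:

```python
import numpy as np

def viewpoint_entropy(projected_areas, background_area=0.0):
    """Viewpoint Entropy: Shannon entropy (in bits) of the relative
    projected areas of the visible faces plus the background."""
    a = np.append(np.asarray(projected_areas, dtype=float), background_area)
    a = a[a > 0]              # zero-area terms contribute nothing
    p = a / a.sum()           # relative projected areas
    return float(-(p * np.log2(p)).sum())

def visibility_ratio(visible_face_areas, total_surface_area):
    """Visibility Ratio: visible surface area over total surface area."""
    return float(np.sum(visible_face_areas) / total_surface_area)

# Two equally large visible faces, no background: entropy is 1 bit.
print(viewpoint_entropy([0.5, 0.5]))        # 1.0
# Half of the surface is visible from this viewpoint.
print(visibility_ratio([1.0, 1.0], 4.0))    # 0.5
```

Higher entropy corresponds to views in which the visible faces cover the image more evenly, which is why maximizing VE tends to produce informative overview viewpoints.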
We demonstrate the effectiveness of our two-stage dynamic label generation (ML + Gaussian Labels (GL)) by comparing it against a single-label cosine-distance baseline (SL) and against existing work on resolving label ambiguity: Deep Label Distribution Learning (DLDL) [GXX∗17], which directly predicts the viewpoint quality distribution, and Spherical Regression (SR) [LGS19], which splits the optimization into a regression for the absolute values |v| and a classification task for the signs.
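The idea behind Gaussian labels can be illustrated with a small sketch: instead of a single one-hot target, each candidate viewpoint on the sphere is weighted by a Gaussian of its angular distance to the nearest high-quality view. This is a hypothetical illustration of the general technique, not the authors' exact formulation; the `sigma` parameter and the nearest-view reduction are assumptions:

```python
import numpy as np

def gaussian_labels(candidates, best_views, sigma=0.2):
    """Soft label distribution over candidate viewpoints on the unit
    sphere: each candidate is weighted by a Gaussian of its angular
    distance (in radians) to the closest high-quality view.
    Illustrative sketch only, not the paper's exact scheme."""
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    b = best_views / np.linalg.norm(best_views, axis=1, keepdims=True)
    # angular distance from every candidate to its nearest "best" view
    cosines = np.clip(c @ b.T, -1.0, 1.0)
    ang = np.arccos(cosines).min(axis=1)
    w = np.exp(-0.5 * (ang / sigma) ** 2)
    return w / w.sum()  # normalized label distribution

views = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0]])
best = np.array([[1.0, 0, 0]])
labels = gaussian_labels(views, best)
# The candidate coinciding with the best view receives the largest weight.
```

Making the choice of `best_views` depend on the current network prediction, rather than fixing it in preprocessing, is what renders the label decision adaptive during training.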