Abstract

Many problems in computer vision today are solved via deep learning. Tasks like pose estimation from images, pose estimation from point clouds or structure from motion can all be formulated as a regression on rotations. However, there is no unique way of parametrizing rotations mathematically: matrices, quaternions, the axis-angle representation and Euler angles are all commonly used in the field. Some of them, however, have intrinsic limitations, including discontinuities, gimbal lock or antipodal symmetry. These limitations can make learning rotations via neural networks a challenging problem, potentially introducing large errors. Following recent literature, we consider three case studies: a sanity check, pose estimation from 3D point clouds and an inverse kinematics problem. In each, we employ a full geometric algebra (GA) description of rotations. We compare the GA formulation with a 6D continuous representation previously presented in the literature in terms of regression error and reconstruction accuracy. We empirically demonstrate that parametrizing rotations as bivectors outperforms the 6D representation. Like the 6D representation, the GA approach overcomes the continuity issues of other representations, but it also requires fewer learned parameters and offers greater robustness to noise. GA hence provides a broader framework for describing rotations in a simple and compact way that is suitable for regression tasks via deep learning, showing high regression accuracy and good generalizability in realistic high-noise scenarios.
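
To make the parameter-count comparison concrete, the following minimal sketch maps each representation to a rotation matrix: three bivector coefficients versus six numbers for the continuous representation. It is an illustration under stated assumptions, not the method used in this work. The function names `bivector_to_matrix` and `sixd_to_matrix` are hypothetical, the bivector is identified with its dual axis-angle vector (so its norm is the rotation angle), and the 6D map is the usual Gram-Schmidt construction; the sign and ordering conventions are chosen here for convenience.

```python
import numpy as np


def bivector_to_matrix(b):
    """Map 3 bivector coefficients (b23, b31, b12) to a 3x3 rotation matrix.

    Assumption: the bivector is identified with its dual axis-angle vector,
    so its norm is the rotation angle. Conventions are illustrative only.
    """
    b = np.asarray(b, dtype=float)
    theta = np.linalg.norm(b)
    if theta < 1e-12:
        # A vanishing bivector corresponds to the identity rotation.
        return np.eye(3)
    x, y, z = b / theta
    # Skew-symmetric generator of the rotation plane.
    K = np.array([[0.0, -z, y],
                  [z, 0.0, -x],
                  [-y, x, 0.0]])
    # Closed form of the matrix exponential (Rodrigues' formula).
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def sixd_to_matrix(v):
    """Map 6 numbers to a rotation matrix by Gram-Schmidt orthonormalization.

    Assumption: the first three entries and the last three entries are two
    non-parallel 3D vectors that become the first two matrix columns.
    """
    v = np.asarray(v, dtype=float)
    a1, a2 = v[:3], v[3:]
    b1 = a1 / np.linalg.norm(a1)
    a2 = a2 - np.dot(b1, a2) * b1  # remove the component along b1
    b2 = a2 / np.linalg.norm(a2)
    b3 = np.cross(b1, b2)          # third column completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)


# Quarter turn in the e1-e2 plane, encoded with 3 numbers.
R_biv = bivector_to_matrix([0.0, 0.0, np.pi / 2])
# Identity rotation, encoded with 6 numbers.
R_6d = sixd_to_matrix([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
```

In a regression setting, a network head would output the three (or six) raw numbers and a map of this kind would turn them into a valid rotation before the loss is computed. Both maps always return an orthogonal matrix with determinant 1 for valid input, so no post-hoc projection is needed.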
