The estimation of object orientation from RGB images is a core component of many modern computer vision pipelines. Traditional techniques mostly predict a single orientation per image, learning a one-to-one mapping between images and rotations. However, when objects exhibit rotational symmetries, they can appear identical from multiple viewpoints. This makes the estimation problem ambiguous: images map to rotations in a one-to-many fashion. In this paper, we explore several ways of addressing this problem, specifically considering algorithms that map an image to a set of rotation estimates, thereby accounting for symmetry-induced ambiguity. Our contributions are threefold. Firstly, we create a data set with annotated symmetry information that covers symmetries induced by self-occlusion. Secondly, we compare and evaluate various learning strategies for multiple-hypothesis prediction models applied to orientation estimation. Finally, we propose to model orientation estimation as a binary classification problem. To this end, building on existing work from the field of shape reconstruction, we design a neural network that can be sampled to reconstruct the full range of ambiguous rotations for a given image. Quantitative evaluation on our annotated data set demonstrates its performance and motivates our design choices.
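As a rough illustration of this binary-classification view, the sketch below scores image-rotation pairs and recovers the set of plausible orientations by sampling candidate rotations. It is a hypothetical minimal example: the class name `RotationClassifier`, the feature dimensions, and the MLP layout are our own assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of orientation estimation as binary classification:
# a network scores (image feature, candidate rotation) pairs, and sampling
# many candidate rotations recovers the set of plausible orientations.
import torch
import torch.nn as nn

class RotationClassifier(nn.Module):
    """Scores whether a candidate rotation is consistent with an image."""
    def __init__(self, feat_dim: int = 128, hidden: int = 256):
        super().__init__()
        # Input: image feature concatenated with a unit quaternion (4 values).
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # binary logit: rotation valid / invalid
        )

    def forward(self, img_feat: torch.Tensor, quat: torch.Tensor) -> torch.Tensor:
        return self.mlp(torch.cat([img_feat, quat], dim=-1)).squeeze(-1)

def sample_uniform_quaternions(n: int) -> torch.Tensor:
    """Normalized 4D Gaussians are uniformly distributed unit quaternions."""
    q = torch.randn(n, 4)
    return q / q.norm(dim=-1, keepdim=True)

# Usage: evaluate many candidates and keep those the classifier accepts,
# reconstructing the full ambiguous set of rotations for one image.
model = RotationClassifier()
img_feat = torch.randn(1, 128)            # stand-in for a CNN image embedding
candidates = sample_uniform_quaternions(4096)
logits = model(img_feat.expand(4096, -1), candidates)
plausible = candidates[torch.sigmoid(logits) > 0.5]
```

Sampling the classifier in this way, rather than regressing a single rotation, is what allows the model to represent the full range of rotations a symmetric object can plausibly have.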