Abstract

Learning meaningful representations of images in scientific domains that are robust to variations in centroids and orientations remains an important challenge. Here we introduce the centroid- and orientation-aware disentangling autoencoder (CODAE), an encoder–decoder-based neural network that learns meaningful content of objects in a latent space. Specifically, a combination of a translation- and rotation-equivariant encoder, Euler encoding and an image moment loss enables CODAE to extract features invariant to the positions and orientations of objects of interest from randomly translated and rotated images. We evaluate this approach on several publicly available scientific datasets, including protein images from the life sciences, four-dimensional scanning transmission electron microscopy data from materials science and galaxy images from astronomy. The evaluation shows that CODAE learns centroids, orientations and features invariant to both, and that it outputs high-quality aligned reconstructions as well as exact-view reconstructions of the input images.
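For readers unfamiliar with image moments, the following minimal sketch shows the standard way a centroid and a principal-axis orientation can be read off an image's first- and second-order moments. It is not CODAE's loss, whose exact form is not given in this abstract; the function name `centroid_and_orientation` and the toy bar image are assumptions made for illustration only.

```python
import numpy as np

def centroid_and_orientation(image):
    """Return (cx, cy, theta) from image moments.

    Illustrative helper (hypothetical name, not from the paper).
    cx, cy are the intensity-weighted centroid in pixel coordinates;
    theta is the principal-axis angle in radians, measured in image
    coordinates (x to the right, y downward).
    """
    img = np.asarray(image, dtype=float)
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]

    m00 = img.sum()  # zeroth moment: total intensity
    if m00 == 0:
        raise ValueError("image has zero total intensity")

    # First-order moments give the centroid.
    cx = (xs * img).sum() / m00
    cy = (ys * img).sum() / m00

    # Second-order central moments describe the spread about the centroid.
    dx, dy = xs - cx, ys - cy
    mu20 = (dx ** 2 * img).sum() / m00
    mu02 = (dy ** 2 * img).sum() / m00
    mu11 = (dx * dy * img).sum() / m00

    # Orientation of the major axis of the intensity distribution.
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, theta

# Toy example: a horizontal bar centred in a 9x9 image.
bar = np.zeros((9, 9))
bar[4, 2:7] = 1.0
cx, cy, theta = centroid_and_orientation(bar)
```

Because the second-order moments are unchanged by a 180-degree rotation, an orientation estimated this way is only defined modulo π; how CODAE resolves that ambiguity is not stated in the abstract.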