To address the low detection accuracy and poor generalization of most current anomaly detection models, this paper proposes RCM-FSAD, a few-shot anomaly detection model that achieves feature registration through a convolutional multidimensional attention module. The approach strengthens the model's overall perception of the image. A spatial transformer network extracts the spatial transformation features of the image and increases sensitivity to the relevant features, so that the model learns the commonality between categories and generalizes better. Deformable convolutional networks v2 capture the spatial transformations and local structures of the input data to ensure spatial invariance. The model is trained on normal samples only and performs both anomaly detection and localization of anomalous regions. On the challenging MVTec AD dataset, the unsupervised model improves anomaly detection accuracy and generalizes better than current state-of-the-art unsupervised anomaly detection methods.