Unsupervised detection and localization of image anomalies is important across many domains, particularly industrial quality inspection. Despite its widespread use, the task remains inherently challenging because it relies solely on knowledge of defect-free normal samples. This paper presents the local–global normality learning and discrepancy normalizing flow, a new state-of-the-art model for unsupervised image anomaly detection and localization. In contrast to existing methods, it adopts a two-stream approach that considers both local and global semantics, ensuring stable detection of abnormalities. The framework comprises two key components, the dual-branch transformer and the discrepancy normalizing flow, which handle reconstruction and discrimination, respectively. The framework uses pre-trained convolutional neural networks to extract multi-scale feature embeddings, followed by a novel dual-branch transformer that reconstructs the features from local and global perspectives. The local reconstruction employs self-attention, while the global reconstruction incorporates global prototype tokens and semantic query tokens through the aggregation-cross attention mechanism. Moreover, a discrepancy normalizing flow is developed to estimate the likelihood of anomalies based on the discrepancy between the pre-trained features and the local and global reconstruction results. Extensive validation on established public benchmarks confirms that the method achieves state-of-the-art performance with the proposed local–global reconstruction and discrimination dual-stream framework.
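The discrimination stage described above can be illustrated with a toy sketch. This is not the authors' implementation: the function and class names (`discrepancy`, `AffineFlow`) are hypothetical, the local and global reconstructions are simulated with small random noise instead of a dual-branch transformer, and a single diagonal affine transform stands in for the learned discrepancy normalizing flow. It only shows the general idea of fitting a density to the discrepancies seen on normal data and scoring new positions by negative log-likelihood.

```python
import numpy as np


def discrepancy(features, local_recon, global_recon):
    """Concatenate squared local and global reconstruction errors.

    All inputs have shape (N, C); the result has shape (N, 2C).
    """
    d_local = (features - local_recon) ** 2
    d_global = (features - global_recon) ** 2
    return np.concatenate([d_local, d_global], axis=-1)


class AffineFlow:
    """Assumed stand-in for a learned flow: z = (x - mu) / sigma,
    with a standard normal base distribution."""

    def fit(self, x):
        self.mu = x.mean(axis=0)
        self.sigma = x.std(axis=0) + 1e-6
        return self

    def log_prob(self, x):
        z = (x - self.mu) / self.sigma
        log_base = -0.5 * (z ** 2 + np.log(2 * np.pi)).sum(axis=-1)
        log_det = -np.log(self.sigma).sum()  # change-of-variables term
        return log_base + log_det


rng = np.random.default_rng(0)
C = 8  # toy feature dimension (assumption)

# "Training": discrepancies observed on normal data only.
train_feats = rng.normal(size=(500, C))
train_local = train_feats + 0.05 * rng.normal(size=(500, C))
train_global = train_feats + 0.05 * rng.normal(size=(500, C))
flow = AffineFlow().fit(discrepancy(train_feats, train_local, train_global))

# "Test": the reconstructions follow the normal pattern, but the observed
# feature at position 3 deviates from it, so its discrepancy is large.
clean = rng.normal(size=(10, C))
test_local = clean + 0.05 * rng.normal(size=(10, C))
test_global = clean + 0.05 * rng.normal(size=(10, C))
test_feats = clean.copy()
test_feats[3] += 2.0

# Anomaly score = negative log-likelihood of the discrepancy vector.
scores = -flow.log_prob(discrepancy(test_feats, test_local, test_global))
print(int(np.argmax(scores)))
```

In this toy setting the position whose features the reconstructions fail to reproduce receives the highest score. A real system would replace the simulated reconstructions and the affine transform with trained networks.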