Abstract
Anomaly detection is an active research field in industrial defect detection and medical disease detection. However, previous anomaly detection works suffer from unstable training or from the lack of a universal criterion for evaluating feature distributions. In this paper, we introduce UTRAD, a U-TRansformer based Anomaly Detection framework. Deep pre-trained features are regarded as dispersed word tokens and represented with transformer-based autoencoders. By reconstructing this more informative feature distribution instead of raw images, we achieve a more stable training process and more precise anomaly detection and localization results. In addition, our proposed UTRAD has a multi-scale pyramidal hierarchy with skip connections that helps detect both multi-scale structural and non-structural anomalies. As attention layers are decomposed into multi-level patches, UTRAD significantly reduces the computational cost and memory usage compared with the vanilla transformer. Experiments are conducted on the industrial dataset MVTec AD and the medical datasets Retinal-OCT, Brain-MRI, and Head-CT. Our proposed UTRAD outperforms all other state-of-the-art methods on these datasets. Code is released at https://github.com/gordon-chenmo/UTRAD.
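To make the core idea concrete, the sketch below illustrates the general recipe the abstract describes: features from a frozen pre-trained CNN are flattened into tokens, reconstructed by a transformer autoencoder trained on anomaly-free data, and the token-wise reconstruction error is used as an anomaly map. This is a minimal illustrative sketch, not the authors' UTRAD implementation; the backbone choice, single-scale setup, layer sizes, and all module names are assumptions made here for illustration only.

```python
# Illustrative sketch only -- NOT the authors' UTRAD implementation.
# Shows the idea from the abstract: deep pre-trained features as tokens,
# reconstructed by a transformer autoencoder; reconstruction error = anomaly score.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class TransformerFeatureAutoencoder(nn.Module):
    def __init__(self, feat_dim=128, embed_dim=256, num_layers=3, num_heads=4):
        super().__init__()
        self.proj_in = nn.Linear(feat_dim, embed_dim)
        enc_layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads,
                                               batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model=embed_dim, nhead=num_heads,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.proj_out = nn.Linear(embed_dim, feat_dim)

    def forward(self, tokens):                    # tokens: (B, N, feat_dim)
        x = self.proj_in(tokens)
        memory = self.encoder(x)                  # encode tokens into a latent memory
        recon = self.decoder(x, memory)           # reconstruct tokens from that memory
        return self.proj_out(recon)

# Frozen pre-trained backbone supplies the "dispersed word tokens" (assumed ResNet-18 here).
backbone = resnet18(weights="IMAGENET1K_V1")
feature_extractor = nn.Sequential(*list(backbone.children())[:6]).eval()  # up to layer2 (128 ch)

def anomaly_map(image, model):
    """Per-location anomaly score = token-wise reconstruction error of deep features."""
    with torch.no_grad():
        feats = feature_extractor(image)          # (B, C, H, W)
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2) # (B, H*W, C): feature vectors as tokens
        recon = model(tokens)
        err = (tokens - recon).pow(2).mean(dim=-1)
    return err.reshape(b, h, w)                   # coarse anomaly localization map

# Usage: train the autoencoder on anomaly-free images, then score test images.
model = TransformerFeatureAutoencoder().eval()
dummy = torch.randn(1, 3, 256, 256)
print(anomaly_map(dummy, model).shape)            # torch.Size([1, 32, 32])
```

Note that the actual UTRAD model additionally uses a multi-scale pyramidal hierarchy with skip connections and patch-decomposed attention, which this single-scale sketch omits.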