Surface Defect Classification with Vision Transformer

Jihai Zhao

doi:10.1109/icid57362.2022.9969746

Abstract

Surface defect detection is widely used in the manufacturing industry, and image-based surface defect classification has been proposed as a promising approach. Deep learning works well for a variety of problems. However, for the surface defect detection problem considered in this study, convolutional neural networks (CNNs) are not the best approach in all situations, and their performance can be improved further. The accuracy of CNN-based defect detection depends heavily on the availability of a large dataset; when the dataset is small, with limited samples, CNN performance cannot be guaranteed. This paper therefore uses a Vision Transformer model instead. Experiments show that the Vision Transformer model outperforms CNN baselines (including VGG19, DenseNet, and ResNet) on an aluminum surface defect dataset.

Full Text