Abstract

This study proposes a vision transformer for detecting visual defects on steel surfaces. The approach uses an open-source image dataset to classify steel surface conditions into six fault categories: crazing, inclusion, rolled-in, pitted surface, scratches, and patches. The defect images are first resized and then fed into a vision transformer, which is trained under different hyperparameter configurations to determine the setting that yields the highest classification performance. The model is evaluated for each configuration, and the best-performing configuration is examined using the associated confusion matrices. The proposed model achieves an overall accuracy of 96.39% for the detection and classification of steel surface faults. The study also gives a descriptive account of the vision transformer architecture and compares the performance of the current model with the results of other approaches reported in the literature. Vision transformers can serve as standalone approaches and as suitable alternatives to the widely used convolutional neural networks (CNNs), performing complex defect detection and classification tasks in real time and enabling efficient and robust condition monitoring across a wide range of defects.
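
To illustrate how an overall accuracy figure can be read from a confusion matrix, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the helper names (`confusion_matrix`, `overall_accuracy`, `per_class_recall`), the label order in `CLASSES`, and the example labels are assumptions made for illustration, and the numbers are made up rather than taken from the study.

```python
# Hypothetical label order; the class names follow the abstract.
CLASSES = ["crazing", "inclusion", "rolled-in",
           "pitted surface", "scratches", "patches"]


def confusion_matrix(y_true, y_pred, n_classes):
    """Count predictions; rows index the true class, columns the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def per_class_recall(matrix, names):
    """Diagonal count divided by the row total, for each class."""
    recall = {}
    for i, name in enumerate(names):
        row_total = sum(matrix[i])
        recall[name] = matrix[i][i] / row_total if row_total else 0.0
    return recall


# Made-up labels for illustration only; not results from the study.
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y_pred = [0, 0, 1, 2, 2, 2, 3, 3, 4, 5, 5, 5]

cm = confusion_matrix(y_true, y_pred, len(CLASSES))
acc = overall_accuracy(cm)
recall = per_class_recall(cm, CLASSES)

print(f"overall accuracy: {acc:.4f}")
for name, value in recall.items():
    print(f"{name:>15}: recall {value:.2f}")
```

The off-diagonal entries show which fault categories are confused with each other, which is the information a single accuracy figure hides.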
