This study uses a deep convolutional neural network (CNN) as a proof of concept for detecting cracks on the metallic surface of a hex nut. The goal is an automated receiving inspection process that supplements human inspections conducted on-site. Conventional image processing techniques (IPTs) have been used extensively for fault detection in mechanical infrastructure. These techniques manipulate images to extract characteristic features, such as surface fractures in materials like steel and concrete. However, real-world variables such as changes in lighting and shadows make IPTs difficult to apply reliably. The proposed vision-based method uses a deep learning CNN to overcome these difficulties, eliminating the need to explicitly compare fault features. Because CNNs learn to identify features directly from training images, they are more resilient than IPTs to changing real-world conditions. After training on a dataset of 1081 images of 256 x 256 pixels, the VGG16 CNN architecture achieved an accuracy of approximately 94.17%. Additional CNN architectures, including ResNet, MobileNet, AlexNet, and LeNet-5, were also evaluated and their fault detection accuracies compared in order to select the most appropriate architecture for the model. To evaluate the robustness and flexibility of the proposed method, we tested it on 206 images from a different structure that was not part of the training dataset. These images covered a range of conditions, such as intense light patches and tiny fissures. The results showed that the proposed method outperforms current approaches, highlighting its usefulness for identifying metallic defects in practice.
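The architecture comparison described above amounts to scoring each candidate model on the same labelled images and keeping the one with the highest accuracy. The following is a minimal sketch of that selection step only, not the study's actual evaluation code. The function names `accuracy` and `select_best`, the model names `model_a` and `model_b`, and the toy labels are all hypothetical and are not results from the study.

```python
from typing import Dict, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of images whose predicted label matches the truth."""
    if len(y_true) != len(y_pred):
        raise ValueError("labels and predictions must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def select_best(
    y_true: Sequence[int],
    predictions: Dict[str, Sequence[int]],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the same labels and return the top scorer."""
    if not predictions:
        raise ValueError("at least one candidate model is required")
    scores = {name: accuracy(y_true, preds) for name, preds in predictions.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


if __name__ == "__main__":
    # Hypothetical toy data: 1 = crack, 0 = no crack. Not from the study.
    labels = [1, 0, 1, 1, 0, 0, 1, 0]
    candidate_predictions = {
        "model_a": [1, 0, 1, 1, 0, 0, 1, 1],
        "model_b": [1, 0, 0, 1, 0, 1, 1, 1],
    }
    best_name, all_scores = select_best(labels, candidate_predictions)
    print(best_name, all_scores)
```

In practice each entry in `candidate_predictions` would hold the thresholded outputs of one trained network on the held-out images. Accuracy alone can be misleading when cracked and intact samples are imbalanced, so a fuller comparison might also report precision and recall.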