Detecting surface defects is essential for improving the quality of industrial products and ensuring product safety. However, industrial inspection still faces several open problems: defects that closely resemble one another, large variation in target scale, and the trade-off between detection speed and accuracy. To address these issues, this paper proposes ICA-Net, an industrial defect detection network based on convolutional attention guidance and aggregated multiscale features. First, to handle similar defects in complex backgrounds, this paper proposes a backbone network that combines lightweight convolutional blocks with self-attention modules to extract both the local and global information of images and enhance the network's expressiveness. Second, to exploit both the shallow fine-grained features and the deep semantic features of the backbone, and thereby improve the detection of defects with large scale changes, this paper designs a cross-layer multiscale feature fusion network (CEF-Net). CEF-Net fuses adjacent-layer and cross-layer features through a feature reweighting strategy, which enriches the feature transfer paths and ensures efficient fusion of features at different scales. In addition, a fine-grained feature fusion module (FFM) fuses features from multiple layers to capture more contextual information, strengthen the extraction of fine-grained features, and improve the detection of complex small targets. Finally, to address the inaccurate bounding-box regression and low detection accuracy of existing industrial algorithms, a new IoU loss function (G-IOU) is proposed, which regresses the intersection of the predicted box and the ground-truth box according to the aspect ratio of the ground-truth box, improving the accuracy and stability of detection.
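For readers unfamiliar with IoU-based regression losses, the quantity that the loss above builds on can be sketched as follows. This is a minimal illustration of the plain IoU and the basic loss 1 - IoU only; it does not reproduce the aspect-ratio term of G-IOU, whose exact form is not given in this abstract. The function names `iou` and `iou_loss` and the `(x1, y1, x2, y2)` box format are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2
    (an assumed format, chosen for this illustration).
    """
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def iou_loss(pred_box, gt_box):
    """Basic IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred_box, gt_box)
```

The plain loss gives no gradient signal when the boxes are disjoint and ignores box shape, which is the kind of limitation that motivates shape-aware variants such as the one proposed here.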
Experimental results show that ICA-Net achieves mAP@.5 of 94.1%, 98.6%, 99.4%, 98.8% and 96.5% on the steel, PCB, aluminium, automobile and Xsteel metal surface defect datasets, respectively, while running at 48 FPS. It outperforms current mainstream detectors and meets the needs of practical industrial production.
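As a reminder of what the mAP@.5 metric measures, the sketch below computes average precision for one class from ranked detections and then averages over classes. It assumes that each detection has already been marked as a true positive when it matches a previously unmatched ground-truth box with IoU of at least 0.5, and it uses all-point interpolation of the precision-recall curve, which is one common convention; the exact evaluation protocol used in the experiments is not stated in this abstract. The function names and input layout are hypothetical.

```python
def average_precision(detections, num_gt):
    """Average precision for one class.

    detections: list of (confidence, is_true_positive) pairs, where
        is_true_positive was decided beforehand at IoU >= 0.5 (assumed).
    num_gt: number of ground-truth boxes of this class.
    """
    if num_gt == 0:
        return 0.0

    # Rank detections by descending confidence.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class: list of (detections, num_gt) tuples, one per class."""
    aps = [average_precision(d, n) for d, n in per_class]
    return sum(aps) / len(aps) if aps else 0.0
```

For example, a class with two ground-truth boxes and detections ranked as true positive, false positive, true positive yields an AP of 5/6 under this convention.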