Surface defect detection is crucial in industry, but deploying deep learning models on edge devices is constrained by limited memory and real-time requirements. Model pruning addresses this by reducing the model’s parameter count and improving inference speed. Because industrial defect detection demands high accuracy, the network must be pruned with minimal impact on accuracy. Different industrial scenarios exhibit defects whose features vary in importance and in the scales at which they are represented. To preserve the network’s ability to recognize varied defect features, a pruning method must therefore adapt to the dataset and balance the feature information retained across layers of different scales in each scenario. To address these challenges, we propose a structured pruning method tailored to industrial detection scenarios that significantly reduces model computation and parameter count while preserving network accuracy. Specifically, we design a scoring network that guides the allocation of pruning amounts to each layer by globally evaluating the importance of layers at different scales, based on how the defect features in the dataset perform. Additionally, we employ the L1-norm criterion to identify low-weight filters whose removal minimally affects accuracy. We also consider filter correlation and introduce a comprehensive evaluation method that combines filter contribution with substitutability, so that the filters pruned are those with low impact on detection accuracy. Finally, we obtain the most lightweight network through an iterative “prune-retrain” process. Evaluated on four datasets using segmentation networks, our method achieves over 80% reduction in floating-point operations and 90% reduction in parameters while maintaining accuracy, significantly reducing memory usage and improving inference speed.
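As a rough illustration of how filter contribution and substitutability might be combined within a single convolutional layer, consider the NumPy sketch below. It is not the evaluation method proposed here: the function names, the use of the maximum absolute Pearson correlation as a substitutability measure, and the product form of the combined score are all assumptions made for the example. The per-layer pruning amount `n_prune`, which the scoring network would supply, is simply passed in.

```python
import numpy as np

def l1_contribution(weights):
    """L1 norm of each filter; weights has shape (out_ch, in_ch, kh, kw)."""
    return np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

def substitutability(weights):
    """Assumed measure: largest absolute Pearson correlation between
    each filter and any other filter in the same layer."""
    flat = weights.reshape(weights.shape[0], -1).astype(float)
    centered = flat - flat.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    unit = centered / np.maximum(norms, 1e-12)  # guard constant filters
    corr = np.abs(unit @ unit.T)
    np.fill_diagonal(corr, 0.0)  # ignore self-correlation
    return corr.max(axis=1)

def select_filters_to_prune(weights, n_prune):
    """Return the sorted indices of the n_prune lowest-scoring filters.

    Assumed score: normalized L1 contribution times (1 - substitutability),
    so a filter scores low if it is small or easily replaced by another.
    """
    contribution = l1_contribution(weights)
    contribution = contribution / max(contribution.max(), 1e-12)
    score = contribution * (1.0 - substitutability(weights))
    return np.sort(np.argsort(score)[:n_prune])

# Toy layer with four 1x2x2 filters (hypothetical values).
layer = np.array([
    [1.0, -2.0, 3.0, -4.0],
    [0.1, 0.1, -0.1, 0.2],
    [5.0, 0.0, -5.0, 1.0],
    [2.0, 2.0, -1.0, 0.0],
]).reshape(4, 1, 2, 2)
to_prune = select_filters_to_prune(layer, n_prune=1)
```

One caveat of this simplified score is that two nearly identical filters both receive low scores, so removing one filter at a time and rescoring, in the spirit of the iterative prune-retrain process, avoids discarding both members of a redundant pair.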