Structural defects account for a large proportion of defects, and acquiring large batches of high-quality labels for industrial visual defect inspection is labor-intensive and time-consuming. This paper addresses the problem by exploiting abundant unlabeled samples, aiming to achieve superior model performance with only a small amount of labeled data through self-training methods that incorporate positional information. Specifically, we propose a novel self-training architecture, MixSiam, which combines a Multi-Position-based Mix strategy (MPMix) with a Siamese network structure for defect classification. Furthermore, to address prediction noise on unlabeled data during training, we propose a progressive MPMix strategy (MPMix+) that reduces the negative impact of noise on model training. Finally, we validate the effectiveness of our architecture on industrial datasets. For example, with only 100 labeled samples, our method achieves 71.40% and 87.01% accuracy on the SMT (Surface Mounting Technology) and MBH (Motor Brush Holder) datasets, respectively, which is 2.40% and 5.86% higher than the state-of-the-art FixMatch method. Compared with a supervised algorithm trained on 3,600 labels, our method achieves comparable accuracy on both the SMT and MBH datasets while saving two-thirds of the labeled data. In conclusion, MixSiam effectively utilizes unlabeled industrial data and improves model accuracy with fewer labeled samples, thereby reducing the data annotation burden in industrial production.
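
To make the idea of mixing at multiple positions concrete, the following is a minimal sketch of a CutMix-style augmentation that pastes several patches from one image into another at random positions and mixes the labels by the resulting area ratio. It is not the MPMix procedure itself, whose details are not given in this abstract; the function names `multi_position_mix` and `mix_labels`, the parameters `num_patches` and `patch_size`, and the area-based label weighting are all assumptions chosen for illustration.

```python
import numpy as np


def multi_position_mix(img_a, img_b, num_patches=4, patch_size=8, rng=None):
    """Paste square patches of img_b into img_a at several random positions.

    Hypothetical stand-in for a multi-position mix; not the paper's MPMix.
    Returns the mixed image and lam, the fraction of pixels kept from img_a.
    """
    if img_a.shape != img_b.shape:
        raise ValueError("img_a and img_b must have the same shape")
    rng = np.random.default_rng() if rng is None else rng
    height, width = img_a.shape[:2]
    if patch_size > min(height, width):
        raise ValueError("patch_size must fit inside the image")

    # Boolean mask marking every pixel that will be taken from img_b.
    mask = np.zeros((height, width), dtype=bool)
    for _ in range(num_patches):
        top = rng.integers(0, height - patch_size + 1)
        left = rng.integers(0, width - patch_size + 1)
        mask[top:top + patch_size, left:left + patch_size] = True

    mixed = img_a.copy()
    mixed[mask] = img_b[mask]

    # Patches may overlap, so lam is computed from the mask rather than
    # from num_patches * patch_size ** 2.
    lam = 1.0 - float(mask.mean())
    return mixed, lam


def mix_labels(label_a, label_b, lam):
    """Blend two label (or pseudo-label) vectors with weight lam on label_a."""
    return lam * label_a + (1.0 - lam) * label_b
```

In a self-training setting, `label_b` would typically be a pseudo-label predicted for an unlabeled image, so a noisy prediction only influences the target in proportion to the pasted area. This is one plausible reason to mix by position, stated here as an assumption rather than as the paper's rationale.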