ABSTRACT Wind turbine blade icing seriously degrades power generation performance and fatigue life, so effective diagnosis of blade icing is critical for mitigating its effects. Current diagnostic methods are strongly affected by imbalanced data, especially when only a small number of samples is available. To address these problems, a novel diagnostic method combining k-means clustering with a label propagation algorithm is proposed. Specifically, k-means clustering is applied to unlabeled SCADA data to generate initial pseudo-labels. The label propagation algorithm then refines these pseudo-labels, improving labeling accuracy and overall classification performance. Finally, the effectiveness of the proposed method is validated using four different types of classifiers on data from two wind farms. The results show that the proposed method improves the average diagnostic accuracy by 3.38% compared with models that discard unlabeled data, and by 4.2% in small-sample scenarios. These results demonstrate that the method achieves high accuracy and strong generalization ability in diagnosing blade icing, offering practical benefits for data analysis and fault diagnosis.
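The two-stage labeling idea described above (k-means pseudo-labels, then label propagation) can be illustrated with a minimal sketch in plain Python on a toy two-feature dataset. The abstract does not specify how clusters are mapped to classes, which graph kernel is used, or any hyperparameters. The nearest-labeled-sample mapping, the RBF kernel with `sigma`, the farthest-point k-means initialisation, the toy data, and all function names below are therefore assumptions made for illustration, not the authors' implementation. Note also that with clamped labels and many iterations, the propagated result depends mostly on the labeled samples, and the pseudo-labels act as an initialisation.

```python
import math

def kmeans(points, k, iters=20):
    """Lloyd's k-means; returns (centroids, cluster index per point)."""
    # Assumption: farthest-point initialisation, to keep the toy example deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assign

def pseudo_labels(centroids, assign, labeled_x, labeled_y):
    """Assumption: each cluster takes the class of the labeled sample nearest its centroid."""
    cluster_class = []
    for c in centroids:
        nearest = min(range(len(labeled_x)), key=lambda i: math.dist(c, labeled_x[i]))
        cluster_class.append(labeled_y[nearest])
    return [cluster_class[a] for a in assign]

def label_propagation(labeled_x, labeled_y, unlabeled_x, init_y,
                      n_classes=2, sigma=1.0, iters=100):
    """Clamped label propagation on a dense RBF graph; returns labels for unlabeled_x."""
    xs = labeled_x + unlabeled_x
    n, n_l = len(xs), len(labeled_x)
    # RBF affinities, row-normalised into transition probabilities.
    t = []
    for a in xs:
        row = [math.exp(-math.dist(a, b) ** 2 / (2 * sigma ** 2)) for b in xs]
        total = sum(row)
        t.append([v / total for v in row])
    # One-hot start: true labels for labeled rows, pseudo-labels for the rest.
    f = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in labeled_y + init_y]
    clamp = [row[:] for row in f[:n_l]]
    for _ in range(iters):
        f = [[sum(t[i][j] * f[j][c] for j in range(n)) for c in range(n_classes)]
             for i in range(n)]
        f[:n_l] = [row[:] for row in clamp]  # keep the known labels fixed
    return [max(range(n_classes), key=lambda c: f[i][c]) for i in range(n_l, n)]

# Toy, imbalanced data (hypothetical normalised features): 20 normal, 6 icing samples.
normal = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(4)]
icing = [(3 + 0.1 * i, 3 + 0.1 * j) for i in range(3) for j in range(2)]
labeled_x, labeled_y = normal[:2] + icing[:1], [0, 0, 1]  # 0 = normal, 1 = icing
unlabeled_x = normal[2:] + icing[1:]
truth = [0] * 18 + [1] * 5

centroids, assign = kmeans(unlabeled_x, k=2)
pseudo = pseudo_labels(centroids, assign, labeled_x, labeled_y)

# Flip one pseudo-label on purpose to see whether propagation repairs it.
noisy = pseudo[:]
noisy[0] = 1 - noisy[0]
refined = label_propagation(labeled_x, labeled_y, unlabeled_x, noisy)
print(pseudo == truth, refined == truth)
```

On this well-separated toy data, the deliberately flipped pseudo-label is pulled back to the class of its neighbours by the propagation step. Real SCADA data would need feature scaling, a sparse (for example k-nearest-neighbour) graph, and tuned hyperparameters, none of which are given in the abstract.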