Deep learning approaches have been successfully applied to the automatic classification of phase-resolved partial discharge (PRPD) diagrams. Under the supervised learning paradigm, however, classifier performance depends strongly on the availability of large, previously labeled data sets. Labeling is labor-intensive and time-consuming, typically requiring an expert to manually annotate a large number of examples. In this work, we propose a label propagation algorithm for PRPD data sets that aims to reduce the time needed to label PRPDs manually. Our basic pipeline comprises three phases: pre-processing, dimensionality reduction, and clustering. Different configurations of the basic pipeline are tested on PRPDs obtained from online measurements in hydrogenerators. The performance of each configuration is assessed using the Silhouette, Caliński–Harabasz, and Davies–Bouldin scores. The clusterings produced by the three best configurations are compared with annotated PRPDs using the Fowlkes–Mallows index. Results suggest that our strategy can substantially decrease the time required for manual labeling.
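To make the final comparison step concrete, the following is a minimal sketch of how a clustering could be scored against expert annotations with the Fowlkes–Mallows index, and how a few expert labels might be spread to every member of a cluster. The abstract does not specify the propagation rule, so the majority vote used here is an assumption, and the function names `fowlkes_mallows` and `propagate_labels` as well as the label strings are hypothetical.

```python
from collections import Counter
from math import comb, sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Fowlkes-Mallows index computed by pair counting.

    FMI = TP / sqrt((TP + FP) * (TP + FN)), where TP is the number of
    sample pairs grouped together in both labelings.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    # Pairs that share both the true label and the predicted cluster.
    tp = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs sharing a true label (TP + FN) and pairs sharing a cluster (TP + FP).
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    if same_true == 0 or same_pred == 0:
        return 0.0
    return tp / sqrt(same_true * same_pred)


def propagate_labels(cluster_ids, annotated):
    """Assign each sample the majority expert label found in its cluster.

    Assumption (not stated in the abstract): propagation is a majority vote.
    `annotated` maps a sample index to the label an expert gave it.
    Samples in clusters with no annotated member receive None.
    """
    votes = {}
    for idx, label in annotated.items():
        votes.setdefault(cluster_ids[idx], Counter())[label] += 1
    majority = {c: v.most_common(1)[0][0] for c, v in votes.items()}
    return [majority.get(c) for c in cluster_ids]


# Hypothetical example: six PRPDs in three clusters, two of them annotated.
clusters = [0, 0, 0, 1, 1, 2]
expert = {0: "type_a", 3: "type_b"}
propagated = propagate_labels(clusters, expert)
```

In this sketch the expert annotates one PRPD per cluster and the remaining members inherit that label, which is where the time saving would come from; the index ranges from 0 to 1, with 1 indicating that the clustering matches the annotations exactly.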