Hydrogenerators are strategic assets for power utilities, and their reliability and availability bring significant benefits. For decades, monitoring and diagnosis of hydrogenerators have been at the core of maintenance strategies. The main cause of hydrogenerator breakdown is failure of the high-voltage stator, one of the machine's major components. Our previous study showed that more than 85% of stator failure mechanisms involve Partial Discharge (PD) activity. PDs are minute sparks that occur within voids inside high-voltage insulation or in the air around the insulation system. No single PD event causes immediate failure, but PD activity slowly erodes the insulation system and leads to breakdown over years to decades. PD signals can be detected from the lead of the hydrogenerator while it is running, which allows on-line diagnosis. Hydro-Québec has collected more than 33,000 unlabeled PD measurement files over the last decades using two types of measurement instruments, one of which is the Partial Discharge Analyzer (PDA). PDs have several different sources, and each source is characterized by a specific signature. It is therefore essential to recognize the nature of the PDs automatically in order to anticipate possible degradation. Expert rules for characterizing these PD signals exist, but they cannot be used as an automatic classification tool: for certain ambiguous cases (conflict between classes) or when several signatures are present at the same time (multi-class), the expert's judgement remains essential.
 A Variational AutoEncoder (VAE) is used for dimension reduction and projection into a 2D latent space, in which the training data are analyzed and then classified. The 2D latent space of the VAE restructures and reorganizes the data space, which makes it possible to anticipate the performance of the classifier. The problem is how to optimize this latent space so as to obtain the best distribution of the different PD sources and maximize the chances of correct responses from the classifier. To our knowledge, no method in the literature currently gives a clear answer to this question. What is certain, however, is that the optimization of VAE learning is directly related to the quality of the training set, i.e. a large and perfectly balanced database in which all cases are present and sufficiently represented. The objective of this paper is to compare the quality of the latent space obtained from the experts' rules with that of a latent space obtained directly from the input signal in an "end-to-end" approach.
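 As an illustration of what shapes a 2D VAE latent space, the following minimal sketch shows the two standard ingredients of the latent part of a VAE: the reparameterized sample and the Kullback-Leibler term that pulls each encoded point toward a standard normal prior. It is not the architecture used in this work; the function names and the encoder outputs `mu` and `log_var` are hypothetical values chosen for illustration.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

rng = random.Random(0)                  # fixed seed so the sketch is reproducible
mu, log_var = [0.5, -1.0], [0.0, 0.0]   # assumed encoder output for one measurement
z = reparameterize(mu, log_var, rng)    # a point in the 2D latent space
kl = kl_to_standard_normal(mu, log_var) # regularization term added to the reconstruction loss
```

 In a full VAE this KL term is added to a reconstruction loss; the balance between the two influences how tightly the different classes cluster in the 2D plane.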
 The first method is an original unsupervised deep learning approach to PD recognition. Instead of having experts label the PD measurement files for a supervised learning process, we use the rules developed by Hydro-Québec experts to create a feature vector from recognizable PD signatures. Labelling a sufficient quantity of signatures for a supervised approach is very time-consuming and therefore impractical. This is a common problem in industry, where more and more operational data are available but cannot be labelled by experts, who are busy with other tasks. Expert knowledge is thus injected into the feature vector. In the second method, several Generative Adversarial Networks (GANs), associated with several types of PDs, are used to generate artificial signals in order to increase the size of the training set and, above all, to balance it. A new latent space is then obtained by training the VAE exclusively on data generated by the GANs.
 Validation tests based on the 33,000 measurements are carried out, with metrics for evaluating the performance of the various latent spaces.