Abstract
In recent years, deep learning has been widely used to tackle problems that classical approaches have not solved. In particular, an autoencoder is a popular unsupervised artificial neural network that learns efficient data representations (encodings) by training the network to ignore features that carry little information. Even though autoencoders outperform classical techniques in several applications, such as anomaly detection, dimensionality reduction, feature denoising, and missing-value imputation, the literature does not provide a commonly accepted methodology for defining the optimal amount of data needed to train the model. This paper proposes a procedure to determine the train-set size that minimizes the reconstruction error of an autoencoder with a predefined structure and hyper-parameters, trained to encode the normal behavior of energy generation systems. The procedure relies on learning curves, a tool for tracking an algorithm's performance as the train-set size varies. The procedure is then applied to three real case studies in which two types of autoencoders are trained to learn the normal behavior of a YANMAR combined heat and power unit, with the aim of detecting incoming anomalies. Finally, the outcomes of the procedure are explained and, under the constraint of a daily retraining frequency, 6 weeks is identified as the optimal train-set size for both autoencoders.
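The learning-curve idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: a PCA-based linear autoencoder stands in for the two autoencoder types, the data are synthetic, the function names are hypothetical, and the selection rule (smallest train-set size whose validation error is within 5% of the best observed) is only one plausible way to read a learning curve.

```python
import numpy as np

def fit_linear_autoencoder(X, n_components):
    """Fit a PCA-based linear autoencoder (stand-in model); returns (mean, components)."""
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(X, mean, components):
    """Mean squared error between X and its encode/decode round trip."""
    codes = (X - mean) @ components.T      # encode
    X_hat = codes @ components + mean      # decode
    return float(np.mean((X - X_hat) ** 2))

def learning_curve(X_train, X_val, sizes, n_components):
    """Validation error for each train-set size, using the most recent n samples."""
    errors = []
    for n in sizes:
        mean, comps = fit_linear_autoencoder(X_train[-n:], n_components)
        errors.append(reconstruction_error(X_val, mean, comps))
    return np.array(errors)

def smallest_adequate_size(sizes, errors, rel_tol=0.05):
    """Smallest size whose error is within rel_tol of the best error (assumed rule)."""
    threshold = errors.min() * (1.0 + rel_tol)
    return sizes[int(np.argmax(errors <= threshold))]

# Synthetic "normal behavior": 10 correlated signals driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(2200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(2200, 10))
X_train, X_val = X[:2000], X[2000:]

sizes = [3, 10, 50, 200, 1000, 2000]
errors = learning_curve(X_train, X_val, sizes, n_components=3)
best_size = smallest_adequate_size(sizes, errors)

for n, e in zip(sizes, errors):
    print(f"train size {n:5d} -> validation MSE {e:.5f}")
print("selected train-set size:", best_size)
```

In a setting like the one described above, `sizes` would correspond to candidate history lengths (for example, numbers of weeks of operating data) and the model would be the actual autoencoder; a retraining-frequency constraint would further limit which sizes are practical.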