Abstract

Wind turbine condition monitoring systems (CMS) produce large amounts of unlabeled data that capture the turbines' operational status. At this volume, detecting incipient wind turbine faults requires robust systems that perform well on unseen test data, which is crucial to maximizing wind farm performance. This paper presents an implementation of a robust unsupervised machine-learning approach for fleet-based anomaly detection in the critical components of wind turbines. The proposed methodology is applied to noisy, unlabeled, and unstructured vibration data, which must first go through databank decoding, data engineering, preprocessing, and feature extraction. Twelve operational wind turbines with varying health conditions are used to train, validate, and test the models. Features from three domains (time, frequency, and mechanical) are extracted and used as model input. Experts labeled the condition of each wind turbine component by evaluating the CMS output. The paper's novelty lies in combining distinctive approaches to optimize eleven unsupervised machine learning algorithms through an unusual 5×2 cross-validation approach applied to real, noisy, and unstructured wind turbine data. The methodology selected the six best models (k-nearest neighbors, clustering-based local outlier, histogram-based outlier, isolation forest, principal component analysis, and minimum covariance determinant) based on robust performance metrics: accuracy, F1-score, precision, recall, and area under the receiver operating characteristic (ROC) curve. These models generalized well and returned reasonable classification metrics for such a complex problem, with values above 90% for the area under the ROC curve.
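The evaluation scheme described above can be illustrated with a minimal sketch, which is not the authors' implementation. It runs 5×2 cross-validation (five repetitions of a two-fold split) around a simple k-nearest-neighbors distance score, one of the model families named in the abstract, and reports the area under the ROC curve per fold. The synthetic two-dimensional data, the stratified split, the choice of k = 3, and all function names are assumptions made for illustration only. Labels are used solely to stratify and to score, never to fit the detector.

```python
import random
from math import dist


def knn_scores(train, test, k=3):
    """Anomaly score of each test point: distance to its k-th nearest training point."""
    return [sorted(dist(x, t) for t in train)[k - 1] for x in test]


def roc_auc(labels, scores):
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def five_by_two_cv(X, y, k=3, seed=0):
    """Five repetitions of a stratified 2-fold split; each half is train once, test once."""
    rng = random.Random(seed)
    by_class = {c: [i for i, lab in enumerate(y) if lab == c] for c in (0, 1)}
    aucs = []
    for _ in range(5):
        fold_a, fold_b = [], []
        for members in by_class.values():
            rng.shuffle(members)  # reshuffle each class separately (assumed stratification)
            half = len(members) // 2
            fold_a += members[:half]
            fold_b += members[half:]
        for train_idx, test_idx in ((fold_a, fold_b), (fold_b, fold_a)):
            train = [X[i] for i in train_idx]  # fitted without labels
            test = [X[i] for i in test_idx]
            y_test = [y[i] for i in test_idx]  # labels used only for scoring
            aucs.append(roc_auc(y_test, knn_scores(train, test, k)))
    return aucs


# Synthetic stand-in for extracted features (hypothetical data, not from the paper).
data_rng = random.Random(42)
normal = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)]
anomalous = [(data_rng.uniform(-8, 8), data_rng.uniform(-8, 8)) for _ in range(10)]
X = normal + anomalous
y = [0] * 100 + [1] * 10

aucs = five_by_two_cv(X, y)
print(f"mean AUC over {len(aucs)} folds: {sum(aucs) / len(aucs):.3f}")
```

The ten per-fold AUC values give both a mean and a spread, which is what makes 5×2 cross-validation useful for comparing several candidate detectors on the same data.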
