On failure prediction and failure identification modeling in a gas turbine system: a survey of classification approaches in a three-class problem

Catherine Cheung, Zouhair Hamaimou, Calista Biondic, Julio Valdes

doi:10.36001/phmconf.2021.v13i1.3052


Open Access

https://doi.org/10.36001/phmconf.2021.v13i1.3052


Abstract

Rapid developments in sensor technology, data processing tools and data storage capability have helped fuel an increased appetite for equipment health monitoring in mechanical systems. As a result, the number of sensors and the amount of data collected for health monitoring have grown tremendously. It is hoped that by collecting large quantities of operational data, predictive tools can be developed that will provide operational, maintenance and safety benefits. Data mining and machine learning techniques are important tools in addressing the ensuing challenge of extracting useful results from the data collected. In this work, the sensor data from a gas turbine system was analyzed with the objective of failure modeling and prediction. Previous efforts had used a two-class approach for this problem, to distinguish healthy and failed states of the system. Here, a third class labelled as deteriorated data is added prior to each failure event to explore the ability of machine learning models to provide early warning of upcoming incidents. Several maintenance incidents were recorded by the sensor system in two separate vehicles. Three approaches to selecting training data were used. The first followed a traditional method of randomly selecting data points from all data according to a desired percentage of failed data to include in training, target ratios between failed and healthy data in each data set, and target ratios between training and testing data. The second data selection strategy was to consider data related to failure incidents as a whole and select certain incidents to include in training, with the remaining ones left unseen for testing. The third approach was cross-validation, which is typically used to evaluate how a classifier will perform on unseen data while still using the entirety of the data to train the final classifier.
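As an illustration of the three-class labelling and the incident-based data selection described above, the following is a minimal sketch and not the authors' implementation. The function names, the representation of failures as `(start, end)` time intervals, the fixed-length deteriorated window, and the per-sample incident identifiers are all assumptions introduced for this example.

```python
from typing import Hashable, List, Sequence, Set, Tuple

# Hypothetical integer codes for the three system states.
HEALTHY, DETERIORATED, FAILED = 0, 1, 2


def label_three_class(
    timestamps: Sequence[float],
    failure_intervals: Sequence[Tuple[float, float]],
    deteriorated_window: float,
) -> List[int]:
    """Label each sample as healthy, deteriorated, or failed.

    Assumption: a sample is FAILED if it lies inside a failure interval
    [start, end], DETERIORATED if it lies within `deteriorated_window`
    time units before a failure start, and HEALTHY otherwise.
    """
    labels = []
    for t in timestamps:
        label = HEALTHY
        for start, end in failure_intervals:
            if start <= t <= end:
                label = FAILED
                break  # failed takes priority over deteriorated
            if start - deteriorated_window <= t < start:
                label = DETERIORATED
        labels.append(label)
    return labels


def split_by_incident(
    incident_ids: Sequence[Hashable],
    train_incidents: Set[Hashable],
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices), keeping each incident whole.

    Assumption: every sample carries the identifier of the incident it
    belongs to, so no incident is shared between training and testing.
    """
    train_idx = [i for i, inc in enumerate(incident_ids) if inc in train_incidents]
    test_idx = [i for i, inc in enumerate(incident_ids) if inc not in train_incidents]
    return train_idx, test_idx


# Usage: ten samples at t = 0..9, one failure during t = 6..7,
# and a deteriorated window of 2 time units before it.
labels = label_three_class(list(range(10)), [(6, 7)], deteriorated_window=2)
train_idx, test_idx = split_by_incident(["A", "A", "B", "C", "B"], {"A", "C"})
```

Varying `deteriorated_window` in this sketch corresponds to changing how far ahead of each failure the early-warning class begins.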
In addition to investigating training and data selection strategies, the effect of hyperparameter optimization was explored, as was the effect of varying the time period of the deteriorated class. Using the gas turbine data, which included 7 failure incidents and 76 predictor variables, a variety of classifier models of the system were developed in a three-class problem to differentiate healthy, deteriorated and failed system states. The classifier methods included support vector machines, Gaussian Naïve Bayes, random forest, AdaBoost, multilayer perceptron, k-nearest neighbors, and XGBoost. Ensemble models were also created to leverage all the individual classifier models that were developed. This paper describes the comprehensive results obtained using the various approaches and combinations, highlighting their respective benefits and limitations.
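The abstract does not state how the ensemble models combine the individual classifiers. One common option is majority voting over the predicted labels, sketched below as an assumption for illustration only. The function name `majority_vote` and the tie-breaking rule (favouring the more severe class, with classes ordered healthy < deteriorated < failed) are hypothetical choices, not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier label predictions by majority vote.

    `predictions[k][i]` is the class predicted by classifier k for sample i.
    Assumption: ties are broken in favour of the highest class index,
    i.e. the more severe state when 0 = healthy, 1 = deteriorated, 2 = failed.
    """
    if not predictions:
        raise ValueError("need at least one classifier's predictions")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    combined = []
    for i in range(n_samples):
        votes = Counter(p[i] for p in predictions)
        top = max(votes.values())
        # Among the classes with the most votes, pick the most severe one.
        combined.append(max(c for c, n in votes.items() if n == top))
    return combined


# Usage: three hypothetical classifiers predicting four samples.
ensemble = majority_vote([
    [0, 1, 2, 0],
    [0, 1, 1, 2],
    [1, 2, 2, 1],
])
```

Breaking ties toward the more severe class is a conservative choice for an early-warning setting; other rules, such as weighting votes by validation accuracy, would be equally plausible.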

Full Text