A dataset of 9568 records of electrical power output (PE) from a combined cycle gas turbine (CCGT), collected over a 6-year period, is evaluated with the transparent open box (TOB) machine-learning method to provide accurate PE predictions and insight into prediction errors. The PE predictions derived by applying the TOB optimized data-matching technique are more accurate than published predictions for the dataset from fifteen correlation-based machine-learning algorithms. TOB achieves this high accuracy using a tuning subset of fewer than 150 (~1.5%) data records. Its accuracy is confirmed by testing the optimized solutions against all dataset records in 15 runs spread across five shuffled datasets. The dataset has a few extreme outliers associated with its four independent variables, and these negatively impact the prediction accuracy of machine-learning methods. Because its calculations for individual data records are transparent and can be audited in forensic detail, the TOB algorithm is able to mine the dataset for useful insight into how the outliers interact with other data records. This enables a filtered dataset (9533 records), which excludes 35 carefully selected data records, to be customized to deliver much improved prediction accuracy (RMSE = 2.89 MW). Mining the dataset also reveals significant differences in the prediction accuracy achieved for different sectors of the PE distribution. This insight indicates that prediction accuracy could be further improved by dividing the dataset into separately optimized subsets: three along its main PE trend plus a fourth, small subset consisting of the outliers. The TOB algorithm demonstrates its value as a machine-learning tool capable of generating accurate predictions and easily auditable data mining. It is well suited to CCGT efficiency and performance optimization.
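The abstract does not spell out the TOB calculations, so the following is only a minimal sketch of what a generic data-matching predictor and an RMSE check might look like. It is not the author's implementation. The function names `match_predict` and `rmse`, the inverse-error weighting, the variable weights, and the number of matches `q` are all assumptions introduced for illustration.

```python
import math

def match_predict(query, records, targets, weights, q=3):
    """Hypothetical data-matching predictor (an assumption, not the TOB code).

    Predicts a target as the inverse-error-weighted mean of the q reference
    records that most closely match the query.
    """
    # Weighted squared error between the query and every reference record.
    errors = [
        sum(w * (a - b) ** 2 for w, a, b in zip(weights, query, rec))
        for rec in records
    ]
    # Indices of the q closest-matching records.
    best = sorted(range(len(records)), key=errors.__getitem__)[:q]
    # Exact matches (zero error) determine the prediction outright.
    exact = [i for i in best if errors[i] == 0.0]
    if exact:
        return sum(targets[i] for i in exact) / len(exact)
    # Otherwise closer matches contribute more to the prediction.
    inv = [1.0 / errors[i] for i in best]
    return sum(f * targets[i] for f, i in zip(inv, best)) / sum(inv)

def rmse(predicted, actual):
    """Root mean squared error, in the same units as the target."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
```

Because each prediction is built from a short, explicit list of matching records, the contribution of any single record (for example, a suspected outlier) can be traced by inspecting `best` and `errors` for that query. In a sketch like this, `weights` and `q` would be the quantities tuned on a small subset of records.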