The continuous increase in global electricity demand has resulted in boiler power plants becoming a significant energy source. Steam production is a principal indicator of boiler efficiency, and its accurate prediction is of paramount importance for enhancing boiler efficiency and reducing operational costs. This study employs a dataset from a boiler with a steam production capacity of 420 tons per hour. A total of 25 independent variables were extracted from the original 39 variables through data processing and feature engineering for use in the prediction analysis. Subsequently, 8 machine learning models were used to model and predict steam production. Grid search cross-validation was employed to optimise the performance of each model. The models were analysed and assessed using the Mean Squared Error (MSE) metric. The results show that random forest achieves the highest accuracy among the 8 single models. Based on these 8 models, a New Bagging ensemble model is proposed that combines the predictions of the 8 single models; it demonstrated the best overall fit and the lowest MSE, achieving the purpose of the research. The present study demonstrates the ability of machine learning algorithms to analyse and predict complex industrial systems, and provides insights into their use for industrial big data analytics and Industry 4.0. Further work could explore larger datasets and deep learning to make predictions more accurate.
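The evaluation described above, scoring each single model by MSE and then scoring a combination of their predictions, can be illustrated with a minimal sketch. The abstract does not state how the New Bagging ensemble combines the single-model predictions, so the equal-weight averaging below is an assumption, and the model names and numbers are made up for illustration; they are not data or results from the study.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def average_predictions(per_model_preds):
    """Combine models by averaging their predictions sample by sample.

    Assumption: equal weights. The study's actual combination rule
    is not given in the abstract.
    """
    return [fmean(sample) for sample in zip(*per_model_preds)]


# Hypothetical steam outputs in tons per hour (illustrative only).
y_true = [400.0, 410.0, 420.0]
preds = {
    "model_a": [398.0, 413.0, 417.0],  # hypothetical single model
    "model_b": [403.0, 408.0, 424.0],  # hypothetical single model
}

# Score each single model, then the averaged ensemble, with the same metric.
ensemble = average_predictions(list(preds.values()))
scores = {name: mse(y_true, p) for name, p in preds.items()}
scores["ensemble"] = mse(y_true, ensemble)

for name, score in scores.items():
    print(f"{name}: MSE = {score:.3f}")
```

In this toy data the two models err in opposite directions by construction, so the averaged prediction scores a lower MSE than either model alone; averaging does not guarantee such an improvement in general.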