Accurate prediction of coking product prices is crucial for improving production efficiency, optimizing costs, and maximizing profit in smart coking facilities. To address the volatility caused by nonlinear factors such as raw material costs, substitutes, macroeconomic indicators, sudden events, policy changes, and market behaviors, we propose a novel integrated method for coking product price prediction. The method combines Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) for signal decomposition, Bidirectional Encoder Representations from Transformers (BERT) for natural language processing, attention mechanisms (AT) to weigh feature importance, and an ensemble of Bidirectional Gated Recurrent Unit, Bidirectional Long Short-Term Memory, and Gated Recurrent Unit networks, abbreviated BBG, for robust feature extraction. We design a feature selection strategy that avoids data leakage and improves the model's predictive ability, and we describe a method that preserves the integrity of textual information when combining data from different sources. Experimental results on coke and methanol datasets show that our approach retains the richness of multi-source text, improves predictive capability, and outperforms other state-of-the-art methods, providing an effective tool for developing smart coke plants.