Data-driven modeling methods are widely used in the petrochemical industry. Because they are accurate and fast, these models guide the simulation and control of industrial petrochemical processes such as fluid catalytic cracking (FCC). In industrial processes, however, the relationship between influencing variables and predictive variables is highly nonlinear, which makes it difficult for models to capture accurately. The data also exhibit temporal and spatial characteristics that conventional models struggle to learn. To address these challenges, a novel hybrid data-driven modeling method is proposed that combines variational mode decomposition (VMD) with a dual-stage attention long short-term memory network (DA-LSTM) and incorporates error compensation (EC). VMD decomposes the predictive variables into multiple components to alleviate the nonlinearity between the influencing and predictive variables. Each component is predicted by the DA-LSTM model, a dual-stage self-attention model that incorporates feature and time attention mechanisms. The EC method predicts the error term, which originates primarily from the residual of the VMD decomposition and from inaccuracies in the DA-LSTM predictions. The superiority of the proposed model is verified by predictions on industrial data from the FCC, Tennessee Eastman, and debutanizer column processes.
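The decompose, predict-per-component, and compensate structure described above can be illustrated with a minimal sketch. This is not the proposed implementation, and every name in it is hypothetical. It makes two loud simplifications: a cascade of moving averages stands in for VMD, and ordinary least squares stands in for DA-LSTM. Only the data flow is meant to match: the target is split into components plus a residual, one model is fitted per component, and a separate error-compensation model is fitted to whatever the summed component predictions miss.

```python
import numpy as np

def decompose(y, windows=(25, 5)):
    """Stand-in for VMD (assumption): successive moving averages plus a residual."""
    components = []
    remainder = np.asarray(y, dtype=float).copy()
    for w in windows:
        smooth = np.convolve(remainder, np.ones(w) / w, mode="same")
        components.append(smooth)
        remainder = remainder - smooth
    return components, remainder  # y == sum(components) + remainder

def fit_linear(X, target):
    """Stand-in for DA-LSTM (assumption): least squares with a bias term."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def hybrid_predict(X_train, y_train, X_test):
    """Return (uncompensated, compensated) predictions for X_test."""
    components, _ = decompose(y_train)
    # One predictor per decomposed component.
    coefs = [fit_linear(X_train, c) for c in components]
    base_train = sum(predict_linear(c, X_train) for c in coefs)
    base_test = sum(predict_linear(c, X_test) for c in coefs)
    # Error compensation: model the decomposition residual plus the
    # component prediction errors, then add that estimate back.
    ec_coef = fit_linear(X_train, y_train - base_train)
    return base_test, base_test + predict_linear(ec_coef, X_test)

# Synthetic data for illustration only; not from any process in the paper.
rng = np.random.default_rng(0)
t = np.arange(300)
X = np.column_stack([np.sin(t / 20), np.cos(t / 7), rng.normal(size=300)])
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=300)

base, compensated = hybrid_predict(X[:240], y[:240], X[240:])
print(base.shape, compensated.shape)
```

Because both stand-ins are linear, the compensation step here only recovers the part of the target that the component models left out. With nonlinear predictors such as DA-LSTM, the error term would also carry each component model's own approximation error, which is the case the EC method is described as targeting.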