Reliable prediction of building energy consumption is essential for effective and sustainable energy management. Traditional forecasting methods, however, struggle with sudden changes in energy consumption. To address this issue, we introduce the GWO-ANFIS-RD3PG prediction model, which aims to preserve global forecasting accuracy while minimizing prediction errors at sudden change points. An Adaptive Neuro-Fuzzy Inference System (ANFIS) combined with the Grey Wolf Optimization (GWO) algorithm is proposed to predict the energy consumption type and its fluctuation magnitude for the next time step. Within the Recurrent Dual Experience Replay Buffer DDPG (RD3PG) prediction module, we employ a Long Short-Term Memory (LSTM) network as the actor network of the Deep Deterministic Policy Gradient (DDPG) algorithm to account for long-term dependencies in time series data. Notably, we introduce an adaptive action modification mechanism that allows the agent to adjust its actions based on the ANFIS output, enabling real-time correction during sudden changes. To further enhance the model’s performance, we adopt a dual experience replay mechanism that stores data samples from sudden change points, capturing informative patterns during these anomalies. Experimental results on two real-world datasets demonstrate that the proposed method reduces the mean absolute error (MAE) by 5.81% and 13.47%, and the root mean square error (RMSE) by 5.69% and 15.21%, compared with state-of-the-art models, showing the significance of the proposed method for improving energy efficiency in real-world building operations.
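
As a rough illustration of the dual experience replay idea mentioned above, the following minimal sketch keeps transitions from sudden change points in a second buffer and mixes them into each training batch. The abstract does not specify the sampling scheme, so the class name `DualReplayBuffer`, the fixed `change_ratio`, the buffer capacities, and the `is_sudden_change` flag are all hypothetical choices made for illustration, not the method as implemented in the paper.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Two FIFO buffers: ordinary transitions and sudden-change transitions.

    Assumption: a fixed fraction of each batch is drawn from the
    sudden-change buffer; the actual RD3PG scheme may differ.
    """

    def __init__(self, capacity=10000, change_capacity=2000,
                 change_ratio=0.25, seed=None):
        self.normal = deque(maxlen=capacity)         # ordinary samples
        self.change = deque(maxlen=change_capacity)  # sudden-change samples
        self.change_ratio = change_ratio
        self.rng = random.Random(seed)

    def add(self, transition, is_sudden_change):
        """Route a transition to the buffer matching its (assumed) label."""
        target = self.change if is_sudden_change else self.normal
        target.append(transition)

    def sample(self, batch_size):
        """Draw a mixed batch, limited by what each buffer currently holds."""
        n_change = min(int(batch_size * self.change_ratio), len(self.change))
        n_normal = min(batch_size - n_change, len(self.normal))
        batch = (self.rng.sample(list(self.change), n_change)
                 + self.rng.sample(list(self.normal), n_normal))
        self.rng.shuffle(batch)
        return batch

    def __len__(self):
        return len(self.normal) + len(self.change)
```

In a DDPG-style training loop, each transition would be added with a flag indicating whether its time step was identified as a sudden change point (for example, from the ANFIS output), and `sample` would then supply the batches used to update the actor and critic networks.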