As urbanization accelerates, effectively managing peak electricity demand becomes increasingly critical for avoiding power outages and system overloads that harm both buildings and power grids. To tackle this challenge, we propose a novel model-free predictive control method, "Dynamic Dual Predictive Control-Deep Deterministic Policy Gradient (D2PC-DDPG)", built on a deep reinforcement learning framework. Our method employs a Deep Forest-Deep Q-Network (DF-DQN) model to predict electricity demand across multiple buildings and, based on the DF-DQN output, applies the Deep Deterministic Policy Gradient (DDPG) algorithm to optimize the coordinated control of energy storage systems, including hot- and chilled-water storage tanks, in multiple buildings. Experimental results show that the proposed DF-DQN model outperforms traditional machine learning, deep learning, and reinforcement learning methods in prediction accuracy, as measured by mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). Moreover, the D2PC-DDPG method achieves superior control performance and peak load reduction compared with other reinforcement learning methods and a rule-based control (RBC) method, reducing peak load by 27.1% and 21.4% over a two-week period in the same regions. To demonstrate its generalizability, we further tested D2PC-DDPG in five different regions against the RBC method; it achieved average reductions of 16.6%, 7%, 9.2%, and 11% in ramping, 1-load_factor, average_daily_peak, and peak_demand, respectively. These findings demonstrate the effectiveness and practicality of the proposed method for critical energy management problems in diverse urban environments.
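To make the two-stage pipeline concrete (a demand predictor whose output augments the state of a DDPG storage controller), the following Python sketch uses hypothetical placeholder components; `predict_demand`, `DDPGActor`, and the toy rollout are illustrative assumptions, not the paper's DF-DQN or D2PC-DDPG implementation.

```python
# Illustrative sketch only: predict_demand and DDPGActor are hypothetical
# stand-ins for the DF-DQN predictor and the trained DDPG policy.
import numpy as np

class DDPGActor:
    """Placeholder deterministic policy: maps an augmented state to
    storage-tank charge/discharge actions in [-1, 1]."""
    def __init__(self, state_dim, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(action_dim, state_dim))

    def act(self, state):
        return np.tanh(self.W @ state)  # one action per storage tank

def predict_demand(history):
    """Stand-in for the DF-DQN predictor: returns next-step electricity
    demand for each building (here, a naive persistence forecast)."""
    return history[-1]

# Toy rollout: observations for 3 buildings, each with a hot-water and a
# chilled-water storage tank (2 actions per building).
n_buildings, obs_per_building = 3, 4
history = [np.ones(n_buildings)]                 # past demand per building
actor = DDPGActor(state_dim=n_buildings * obs_per_building + n_buildings,
                  action_dim=2 * n_buildings)

obs = np.zeros(n_buildings * obs_per_building)   # flattened building observations
demand_forecast = predict_demand(history)        # predicted demand per building
state = np.concatenate([obs, demand_forecast])   # augment state with the forecast
action = actor.act(state)                        # coordinated storage control
print(action.shape)                              # (6,): charge/discharge for 6 tanks
```

The key design point the sketch illustrates is that the predictor's multi-building demand forecast is concatenated onto the controller's observation vector, so the DDPG policy can schedule charging and discharging of the storage tanks ahead of anticipated peaks rather than reacting only to current conditions.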