Deep reinforcement learning (DRL) is currently a cutting-edge artificial intelligence approach to energy management for hybrid electric vehicles. However, inefficient offline training limits the energy-saving efficacy of DRL-based energy management strategies (EMSs). Motivated by this limitation, this article proposes a smart DRL-based EMS, trained in a heuristic learning framework, for an urban hybrid electric bus. To enhance sampling efficiency, the prioritized experience replay technique is integrated into soft actor-critic (SAC), yielding an improved SAC algorithm. To strengthen the generalizability of the improved SAC agent to real driving scenarios, a stochastic training environment is constructed. Curriculum learning is then employed to develop a heuristic learning framework that expedites convergence. Simulation results show that the designed EMS accelerates convergence by 85.58% and reduces fuel consumption by 6.43% compared with the state-of-the-art baseline EMS. Moreover, a computational complexity test indicates that the designed EMS holds significant promise for real-time implementation. These findings highlight the contribution of this article to fuel conservation for urban hybrid electric buses through the application of emerging artificial intelligence technologies.
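To make the sampling-efficiency idea concrete, the following is a minimal sketch of proportional prioritized experience replay, the general technique that the abstract combines with SAC. It is not the article's implementation: the class name `PrioritizedReplayBuffer`, the hyperparameters `alpha`, `beta`, and `eps`, and the use of the absolute temporal-difference error as the priority are illustrative assumptions, since the abstract does not specify how priorities are defined.

```python
import random


class PrioritizedReplayBuffer:
    """Hypothetical list-based proportional prioritized replay (O(n) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # assumed: 0 gives uniform sampling, 1 fully prioritized
        self.eps = eps          # assumed: keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # which makes them likely to be sampled before they are re-scored.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Assumed priority signal: absolute TD error from the critic update.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In an SAC training loop under these assumptions, the agent would call `sample` to draw a batch, scale each critic loss term by the returned weight, and then pass the resulting TD errors to `update_priorities`. A production buffer would typically replace the lists with a sum-tree so that sampling costs O(log n) rather than O(n).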