Energy management strategies typically train reinforcement learning algorithms in a static environment. In real vehicle operation, however, the environment is dynamic and laden with uncertainty and unforeseen disruptions. This study proposes an adaptive learning strategy for dynamic environments that adjusts actions to changing circumstances, drawing on past experience to improve future real-world learning. We developed a memory library for dynamic environments, applied Dirichlet clustering to driving conditions, and incorporated the expectation-maximization (EM) algorithm for timely model updating so that prior knowledge is fully absorbed. The agent adapts swiftly to the dynamic environment and converges quickly, improving hybrid electric vehicle fuel economy by 5–10% while maintaining the final state of charge (SOC). Compared with the Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) algorithms, our method yields smaller fluctuations in the engine operating point and a more compact working region. This study offers a solution for vehicle agents in dynamic environmental conditions, enabling them to evaluate past experience logically and take situationally appropriate actions.
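To make the EM-based model-updating step concrete, the sketch below runs the generic E-step/M-step loop on scalar "driving condition" samples (e.g. mean vehicle speed) drawn from two regimes. This is an illustrative minimal example, not the paper's implementation: the component count (two), the 1-D Gaussian form, and the synthetic urban/highway speed data are all assumptions made for demonstration.

```python
import math
import random

def em_gmm_1d(data, iters=50):
    """Minimal EM for a two-component 1-D Gaussian mixture.

    Illustrative only: the study clusters driving conditions with a
    Dirichlet model and refreshes it via EM; here we show the plain
    E-step / M-step loop on scalar data.
    """
    n = len(data)
    mean = sum(data) / n
    sample_var = sum((x - mean) ** 2 for x in data) / n
    # Initialise: means at the data extremes, broad variances, equal weights.
    mu = [min(data), max(data)]
    var = [sample_var, sample_var]
    w = [0.5, 0.5]

    def pdf(x, m, v):
        return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w[k] * pdf(x, mu[k], var[k]) for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk,
                1e-6,  # floor to keep the variance from collapsing
            )
    return w, mu, var

random.seed(0)
# Synthetic samples: slow urban speeds vs. fast highway speeds (assumed data).
speeds = [random.gauss(20, 2) for _ in range(200)] + \
         [random.gauss(90, 5) for _ in range(200)]
weights, means, variances = em_gmm_1d(speeds)
print(sorted(round(m) for m in means))  # component means near the two regimes
```

In the paper's setting, such an update would be re-run as new driving data accumulates in the memory library, so the condition model tracks the changing environment rather than staying fixed after offline training.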