Complex off-road driving conditions challenge the energy management strategies (EMSs) of hybrid electric vehicles (HEVs). To address this, a novel model-based reinforcement learning (MBRL) algorithm, namely heuristic search, is proposed for the EMS. In the MBRL framework, an RL model of the HEV is first constructed from an online-learned Markov chain (MC) that represents the stochastic driving conditions and a nonlinear state-space model that describes the deterministic powertrain. Heuristic search is then introduced to solve the energy management problem, with two significant advantages: 1) it focuses on searching online for the optimal action in the current state; 2) it uses a heuristic function derived from previous experience to accelerate learning. The optimal action for each HEV state is thus learned in real time, which improves the EMS's adaptability to various driving conditions. In simulation, the proposed EMS is compared with model-free Q-learning (MFQL), model-based Q-learning (MBQL), and dynamic programming (DP) on both an off-road driving cycle and standard cycles. Results show that heuristic search requires only about 30% of the computing time of MBQL and maintains better performance than MFQL across various driving conditions.
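
To make the two ingredients named above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the driving condition is discretized into a small number of power-demand levels, that the Markov chain is learned by counting observed transitions, and that the search looks only one step ahead. The names `OnlineMarkovChain` and `heuristic_search_action`, the callables `step_cost`, `next_soc`, and `heuristic`, and the discount factor `gamma` are all hypothetical placeholders introduced for illustration.

```python
import numpy as np


class OnlineMarkovChain:
    """Transition-probability estimate over discretized power-demand levels.

    Assumption: transitions are learned by simple counting; the paper's
    actual update rule may differ.
    """

    def __init__(self, n_levels):
        # One pseudo-count per transition, so each row is a valid (uniform)
        # distribution before any data has been observed.
        self.counts = np.ones((n_levels, n_levels))

    def update(self, prev_level, next_level):
        # Record one observed transition between demand levels.
        self.counts[prev_level, next_level] += 1.0

    def probs(self, level):
        # Normalize the counts of one row into transition probabilities.
        row = self.counts[level]
        return row / row.sum()


def heuristic_search_action(soc, level, mc, actions, step_cost, next_soc,
                            heuristic, gamma=0.95):
    """One-step lookahead from the current state only.

    Each candidate action is scored as its immediate cost plus the
    discounted expected heuristic value of the successor state, where the
    expectation is taken over the learned Markov chain.
    """
    p_next = mc.probs(level)
    best_action, best_value = None, float("inf")
    for a in actions:
        # Deterministic powertrain model (hypothetical): battery state of
        # charge after applying action a.
        soc_next = next_soc(soc, level, a)
        # Stochastic driving condition: average over next demand levels.
        future = sum(p * heuristic(soc_next, nxt)
                     for nxt, p in enumerate(p_next))
        value = step_cost(soc, level, a) + gamma * future
        if value < best_value:
            best_action, best_value = a, value
    return best_action, best_value
```

In this sketch the search cost grows with the number of candidate actions and demand levels rather than with the size of the whole state space, which is one plausible reading of why searching only from the current state could be cheaper than sweeping a full Q-table.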