Deep reinforcement learning (DRL) is a powerful tool for the intelligent control of hybrid power systems, yet several shortcomings still hinder the progress of learning-based energy management strategies. This paper addresses them in three steps. First, the field needs a public, reliable benchmark of hybrid powertrain models and energy management optimization results. To provide one, two standard Python-based DRL agents are coupled with four Simulink-based hybrid powertrains in a co-simulation training framework. Second, an analysis of the optimization terms in traditional reward functions, in terms of their range, magnitude, and importance, shows that these terms can mislead the agent during training and require cumbersome weight tuning. Accordingly, this paper proposes a novel training approach that combines rule-based engine start-stop with an unweighted reward designed to optimize engine efficiency and ease training. Finally, a hardware-in-the-loop test is performed with a P2 hybrid electric vehicle as the target. The two DRL-based energy management strategies achieve fuel consumption of 6.537 L/100 km and 6.330 L/100 km, respectively, and their more efficient and reasonable control sequences maintain proper engine operation and battery state of charge.
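
To illustrate the concern about the range and magnitude of reward terms, the following minimal sketch shows how a conventional weighted-sum reward can be dominated by one term. It is not the reward used in this paper: the term definitions (fuel rate in g/s, squared deviation of the state of charge from a reference), the reference value of 0.6, the default weights, and all function names are assumptions chosen purely for illustration.

```python
# Illustrative sketch only. Term definitions, units, the SOC reference,
# and the weights are assumptions, not values taken from the paper.

def weighted_reward(fuel_rate_gps, soc, soc_ref=0.6, w_fuel=1.0, w_soc=1.0):
    """Conventional weighted-sum reward (hypothetical form).

    Returns the reward together with the two weighted penalty terms so
    their magnitudes can be compared.
    """
    fuel_term = w_fuel * fuel_rate_gps          # typically on the order of 1
    soc_term = w_soc * (soc - soc_ref) ** 2     # typically on the order of 0.01
    return -(fuel_term + soc_term), fuel_term, soc_term


def term_share(fuel_term, soc_term):
    """Fraction of the total penalty contributed by each term."""
    total = fuel_term + soc_term
    return fuel_term / total, soc_term / total


def balancing_weight(fuel_rate_gps, soc, soc_ref=0.6):
    """SOC weight that would equalize the two terms at this operating point."""
    deviation_sq = (soc - soc_ref) ** 2
    if deviation_sq == 0.0:
        raise ValueError("SOC equals the reference; no finite weight balances the terms")
    return fuel_rate_gps / deviation_sq


if __name__ == "__main__":
    reward, fuel_term, soc_term = weighted_reward(fuel_rate_gps=1.5, soc=0.5)
    fuel_share, soc_share = term_share(fuel_term, soc_term)
    print(f"reward={reward:.4f}, fuel share={fuel_share:.3f}, SOC share={soc_share:.3f}")
    print(f"SOC weight needed to balance: {balancing_weight(1.5, 0.5):.1f}")
```

With equal weights, the fuel term accounts for roughly 99% of the penalty at this operating point, so the agent receives almost no signal about the state of charge. The weight that balances the two terms also depends on the operating point, which is one way to see why weight tuning becomes cumbersome.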