Abstract

The performance of a reinforcement learning-based energy management system for a pure hybrid electric vehicle depends critically on the articulation of the immediate reward function. This brief systematically demonstrates that dependence. The third-generation Toyota Hybrid System is chosen as the electrified powertrain for formulating the energy management problem, and an asynchronous advantage actor-critic-based reinforcement learning framework is chosen as the control strategy for its energy management system. The chosen powertrain architecture offers two degrees of freedom, namely engine speed and engine torque. Because the reinforcement learning agent alone controls these two variables over a given drive cycle, without any tactical controllers, it must not only find a near-optimal trajectory for the control variables but also satisfy the feasibility criteria for practical operation. Since the agent selects the control variables without any prior feasibility check, the immediate reward function must be articulated so that the agent is discouraged from choosing any control input that results in infeasible powertrain operation.
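
As a minimal sketch of the idea described above, the snippet below shows one way an immediate reward could combine a fuel-consumption term with a large fixed penalty for infeasible engine speed/torque choices. All limits, weights, and the fuel-rate model here are illustrative placeholders, not values or functions taken from the paper.

```python
# Illustrative reward shaping for a power-split hybrid energy-management agent.
# Limits, weights, and the fuel-rate model below are assumed placeholders.

ENGINE_SPEED_MAX = 523.0      # rad/s, assumed engine speed limit
ENGINE_TORQUE_MAX = 142.0     # N*m, assumed engine torque limit
SOC_MIN, SOC_MAX = 0.4, 0.8   # assumed battery state-of-charge window
INFEASIBLE_PENALTY = -100.0   # large negative reward for infeasible actions
FUEL_WEIGHT = -1.0            # reward is the negative of fuel consumed


def is_feasible(engine_speed, engine_torque, soc):
    """Simple feasibility check on the two control variables and battery SOC."""
    within_speed = 0.0 <= engine_speed <= ENGINE_SPEED_MAX
    within_torque = 0.0 <= engine_torque <= ENGINE_TORQUE_MAX
    within_soc = SOC_MIN <= soc <= SOC_MAX
    return within_speed and within_torque and within_soc


def fuel_rate(engine_speed, engine_torque):
    """Placeholder fuel-rate model (g/s); a real map would be interpolated
    from engine test data."""
    return 1e-4 * engine_speed * engine_torque + 0.05


def immediate_reward(engine_speed, engine_torque, soc, dt=1.0):
    """Fuel-based reward when feasible, fixed penalty otherwise."""
    if not is_feasible(engine_speed, engine_torque, soc):
        return INFEASIBLE_PENALTY
    return FUEL_WEIGHT * fuel_rate(engine_speed, engine_torque) * dt


if __name__ == "__main__":
    # Feasible operating point: reward is the (negative) fuel used in this step.
    print(immediate_reward(engine_speed=200.0, engine_torque=80.0, soc=0.6))
    # Infeasible point (SOC below the assumed window): fixed penalty instead.
    print(immediate_reward(engine_speed=200.0, engine_torque=80.0, soc=0.2))
```

With such shaping, actions that violate the (assumed) operating limits return a reward far worse than any feasible fuel cost, which is one way to discourage the agent from sampling infeasible engine speed/torque combinations during training.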
