Electric vehicles (EVs) are emerging as an effective means of reducing carbon emissions in modern transportation systems. However, the large-scale integration of EVs has a substantial negative impact on the stability and performance of power grid operation. It is therefore valuable to investigate charging power management methods that improve the charging experience of EV drivers while keeping the bus voltages of the power system within their normal range. This paper formulates the charging power control problem for multiple EVs as a hierarchical Markov decision process (MDP) and proposes a hierarchical multi-objective reinforcement learning method (Hamlet) to obtain real-time charging power control decisions. Specifically, by treating EVs with the same remaining charging time as a single agent, this paper addresses the problem of dynamic state and action spaces and significantly reduces the dimension of both spaces. In addition, this paper quantifies drivers’ anxiety about the battery energy level and distributes the comfort reward over the entire charging interval, which overcomes the delayed comfort reward problem. To alleviate the influence of multiple objectives on training stability, this paper develops a multi-objective learning method that dynamically adjusts the optimization direction. Simulation results on the IEEE 33-bus and 69-bus test feeders demonstrate the effectiveness of the proposed method in mitigating voltage fluctuation, minimizing the charging bill, and maximizing battery comfort.