Abstract

Eco-driving represents a promising avenue for mitigating energy consumption in road transportation. To enhance the applicability of learning-based eco-driving strategies, this study presents a novel framework that employs offline reinforcement learning for eco-driving control. The framework enables a vehicle agent to acquire eco-driving behavior by leveraging real-world human driving trajectories. Specifically, the human driving trajectories, together with the corresponding traffic signal timing scheme obtained from empirical data, are used to construct a comprehensive Markov Decision Process (MDP) dataset for offline policy training. To accommodate learning from sub-optimal human driving data, the Conservative Q-Learning (CQL) algorithm is employed. The proposed offline learning method is then compared with alternative learning-based, model-based, and rule-based approaches, illustrating the feasibility of offline learning and the efficacy of the CQL algorithm. Notably, energy consumption is shown to be reduced by 67.3% relative to a behavioral car-following model, with only a marginal compromise in travel efficiency. Furthermore, a sensitivity analysis is conducted, demonstrating that the offline learning-based method generalizes across various simulation configurations and even across different energy consumption models.
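
To make the role of CQL in this setting concrete, the sketch below shows a minimal, illustrative version of the conservative Q-learning update on a batch of logged (state, action, reward, next state) transitions. This is not the authors' implementation; the network architecture, discrete action space, and the trade-off weight `alpha` are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Small MLP Q-network over a discrete action set (illustrative sizes)."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)  # Q(s, .) for every discrete action


def cql_loss(q_net, target_net, batch, gamma: float = 0.99, alpha: float = 1.0):
    """One CQL objective evaluation on offline data: Bellman error plus a
    conservative penalty that keeps Q-values low for actions not supported
    by the logged (sub-optimal) human driving dataset."""
    s, a, r, s_next, done = batch  # tensors: states, actions, rewards, next states, done flags
    q_all = q_net(s)                                    # Q(s, .)
    q_sa = q_all.gather(1, a.unsqueeze(1)).squeeze(1)   # Q(s, a) for logged actions

    with torch.no_grad():
        target = r + gamma * (1.0 - done) * target_net(s_next).max(dim=1).values

    bellman = nn.functional.mse_loss(q_sa, target)

    # Conservative term: push down log-sum-exp over all actions while
    # pushing up the Q-values of actions actually taken in the dataset.
    conservative = (torch.logsumexp(q_all, dim=1) - q_sa).mean()

    return bellman + alpha * conservative
```

In a training loop, this loss would be minimized over mini-batches sampled from the offline MDP dataset built from the human trajectories and signal timing data, with the target network updated periodically; the conservative term is what lets the learned policy improve on the logged behavior without exploiting actions unseen in the data.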
