The growth of urban populations and the expansion of e-commerce have created significant challenges for last-mile delivery. Electric vehicles (EVs) have been introduced to last-mile delivery as an alternative to fossil-fuel vehicles: they play a pivotal role in reducing greenhouse gas emissions and air pollution, and they contribute to more energy-efficient and environmentally sustainable urban transportation systems. Within these dynamics, the Electric Vehicle Routing Problem (EVRP) has begun to replace the classical Vehicle Routing Problem (VRP) in last-mile delivery. Whereas classical vehicle routing ignores refueling, the EVRP must account for both the locations of charging stations and the charging time, owing to the long recharging duration of EVs. This study addresses the Capacitated EVRP (CEVRP) with a novel Q-learning algorithm. Q-learning is a model-free reinforcement learning algorithm that maximizes an agent's cumulative reward over time by learning to select optimal actions. In addition, a new dataset is published for the EVRP that incorporates field constraints. The dataset is built from real geographical positions in the province of Eskisehir, Türkiye, and, unlike classical EVRP datasets, it includes environmental information such as streets, intersections, and traffic density. Optimal solutions are obtained for each instance of the EVRP using a mathematical model, and the results of the proposed Q-learning algorithm are compared against these optimal solutions. Test results show that the proposed algorithm provides a remarkable advantage by obtaining routes for EVs in shorter time.
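The core Q-learning idea mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: the state/action encoding, reward, and parameter values below are hypothetical stand-ins; in the actual CEVRP setting the state would encode vehicle position, remaining load, and battery level, and the reward would penalize travel and charging time.

```python
# Minimal tabular Q-learning update (illustrative sketch only).
# ALPHA (learning rate), GAMMA (discount factor) are assumed values,
# not those used in the study.
ALPHA, GAMMA = 0.1, 0.9

def q_learning_update(Q, state, action, reward, next_state, actions):
    """Apply one Bellman update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Q is a dict mapping (state, action) pairs to value estimates."""
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)
    return Q[(state, action)]

# Toy usage: the agent moves from node 0 to node 1 and earns reward 1.0.
Q = {}
q_learning_update(Q, state=0, action=1, reward=1.0, next_state=1, actions=[0, 1])
```

In a routing context, repeating this update over many simulated delivery episodes (with an epsilon-greedy choice among feasible next customers or charging stations) gradually shifts the Q-table toward actions that minimize total route cost.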