In 2020, the transportation sector was the second largest source of carbon emissions in the UK and in Newcastle upon Tyne, responsible for about 33% of total emissions. Electric vehicles (EVs) are pivotal to carbon-neutral road transportation and to the UK’s target of reaching net zero emissions by 2050. Optimal EV charging requires a better understanding of the unpredictable output of on-site renewable energy sources (ORES). This paper proposes an integrated EV fleet charging schedule based on proximal policy optimization (PPO) within a deep reinforcement learning framework. To design the reinforcement learning environment, mathematical models of wind and solar power generation are created. In addition, multivariate Gaussian distributions derived from historical weather and EV fleet charging data are used to simulate uncertainty in weather and charging demand, generating large datasets for training the model. The optimization problem is formulated as a Markov decision process (MDP) with operational constraints. A PPO approach is devised to train artificial neural networks (ANNs) through successive transition simulations. The approach is deployed and evaluated on a real-world scenario comprising council EV fleet charging data from Leicester, UK. The results show that, owing to the design of the reward function and the system limitations, the charging action is biased towards midday, when renewable energy output is at its maximum. Charging decisions made by reinforcement learning improve the utilization of renewable energy by 2–4% compared with the random charging policy and the priority charging policy. This study contributes to reducing battery charging and discharging, to generating benefits from electricity sold to the grid, and to reducing carbon emissions.
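
As a rough illustration of the scenario-generation step described above, the following minimal sketch fits a multivariate Gaussian to a small table of historical observations and draws synthetic samples from it. It is not the authors' implementation: the function names, the choice of three variables (wind speed, solar irradiance, charging demand), the clipping of negative draws to zero, and the numbers in `history` are all assumptions made purely for illustration.

```python
import numpy as np


def fit_gaussian(history):
    """Estimate the mean vector and covariance matrix from rows of observations."""
    history = np.asarray(history, dtype=float)
    mean = history.mean(axis=0)
    cov = np.cov(history, rowvar=False)  # columns are variables
    return mean, cov


def sample_scenarios(mean, cov, n_samples, seed=0):
    """Draw synthetic scenarios from the fitted multivariate Gaussian."""
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    # Assumption: the modelled quantities cannot be negative, so clip at zero.
    return np.clip(samples, 0.0, None)


# Made-up placeholder data (NOT from the paper). Hypothetical columns:
# wind speed [m/s], solar irradiance [W/m^2], fleet charging demand [kWh].
history = np.array([
    [5.1, 420.0, 31.0],
    [6.3, 380.0, 28.5],
    [4.2, 510.0, 35.2],
    [7.0, 300.0, 26.1],
    [5.6, 450.0, 33.0],
])

mean, cov = fit_gaussian(history)
scenarios = sample_scenarios(mean, cov, n_samples=1000)
```

Each row of `scenarios` could then serve as one simulated time step for the training environment. Sampling jointly, rather than per variable, preserves the correlations between weather and demand present in the historical data; the clipping step slightly distorts the distribution and is only one possible way to enforce physical bounds.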