With growing global demand for renewable energy and heightened environmental awareness, electric vehicles (EVs) are rapidly becoming a popular, clean, and efficient mode of transportation. However, their widespread adoption presents several challenges, including lagging development of charging infrastructure, added load on the power grid, and dynamically changing user charging behavior. To address these issues, this paper first proposes a vehicle-to-grid (V2G) optimization framework that responds to regional dynamic pricing. The framework accounts for power balancing at charging and discharging stations when a large number of EVs participate in scheduling, with the aim of maximizing the benefits for EV owners. Next, by exploiting the interaction between environmental states and the dynamic behavior of EVs, we design an optimization algorithm that combines the recurrent proximal policy optimization (RPPO) algorithm with long short-term memory (LSTM) networks. This approach improves convergence and grid stability while maximizing benefits for EV owners. Finally, a simulation platform is used to validate the RPPO algorithm in optimizing V2G and grid-to-vehicle (G2V) charging strategies, providing a theoretical foundation and technical support for the development of smart grids and sustainable transportation systems.