Background: In recent years, with the development of the Internet of Vehicles, a variety of novel in-vehicle application devices have emerged with increasingly stringent latency requirements. Vehicular edge networks (VENs) can make full use of network edge devices, such as roadside units (RSUs), for collaborative processing, which can effectively reduce latency. Objective: Most existing studies, including patents, assume that an RSU has sufficient computing resources to provide unlimited services. In practice, its available computing resources shrink as the number of processing tasks grows, which constrains delay-sensitive vehicular applications. To address this problem, this paper proposes a vehicle-to-vehicle computing task offloading method based on deep reinforcement learning, which fully considers the remaining available computational resources of neighboring vehicles in order to minimize the total task processing latency and increase the offloading success rate. Methods: In the multi-service vehicle scenario, the analytic hierarchy process (AHP) was first used to prioritize the computing tasks of user vehicles.
Next, an improved sequence-to-sequence (Seq2Seq) computing task scheduling model combined with an attention mechanism was designed, and the model was trained with an actor-critic (AC) reinforcement learning algorithm, with the optimization goal of reducing the processing delay of computing tasks and improving the offloading success rate. On this basis, a task offloading strategy optimization model based on AHP-AC was obtained. Results: Average latency and execution success rate were used as performance metrics to compare the proposed method with three other task offloading methods: local-only processing, a greedy strategy-based algorithm, and a random algorithm. In addition, experimental validation over both CPU frequency and the number of SVs was carried out to demonstrate the good generalization ability of the proposed method. Conclusion: The simulation results show that the proposed method outperforms the other methods in reducing task processing delay and improving the task offloading success rate, which addresses the problem of limited execution of delay-sensitive tasks caused by insufficient computational resources.
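
The AHP prioritization step mentioned in the Methods can be illustrated with a minimal sketch. It is not the paper's implementation: the three task classes, the pairwise comparison values, and the use of the row geometric-mean approximation (rather than an exact principal-eigenvector computation) are all assumptions made here for illustration. The sketch derives priority weights from a pairwise comparison matrix and checks Saaty's consistency ratio.

```python
import math

# Hypothetical task classes and pairwise judgments (assumed, not from the paper).
# pairwise[i][j] states how much more important class i is than class j
# on Saaty's 1-9 scale; the matrix is reciprocal by construction.
TASK_CLASSES = ["safety", "navigation", "entertainment"]
PAIRWISE = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]

# Saaty's random consistency index for small matrix sizes.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(pairwise, weights):
    """Estimate the consistency ratio CR = CI / RI for a 3x3 to 5x5 matrix."""
    n = len(pairwise)
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


weights = ahp_weights(PAIRWISE)
cr = consistency_ratio(PAIRWISE, weights)

# Rank the task classes by descending priority weight.
ranking = sorted(zip(TASK_CLASSES, weights), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    for name, w in ranking:
        print(f"{name}: {w:.3f}")
    print(f"consistency ratio: {cr:.4f}")
```

A consistency ratio below 0.1 is the usual threshold for accepting the pairwise judgments. In a pipeline like the one described, the resulting weights would order the tasks before they are passed to the scheduling model. How the paper actually combines the two stages is not specified in this abstract.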