Abstract

In an autonomous vehicle, lane following is an important component and a basic function of autonomous driving. However, existing lane following systems have several shortcomings. First, the control methods they adopt require an accurate system model, and because different vehicles have different parameters, a large amount of parameter calibration work is needed. Second, they may fail on road sections that demand high lateral acceleration from the vehicle, such as sharp curves. Third, their decision-making systems are defined by hand-crafted rules, which brings disadvantages: the rules are difficult to formulate, human subjective factors cannot guarantee objectivity, and coverage is hard to guarantee. In recent years, the deep deterministic policy gradient (DDPG) algorithm has been widely used in the field of autonomous driving due to its strong nonlinear fitting ability and generalization performance. However, the DDPG algorithm suffers from overestimated state-action values, large cumulative errors, and low training efficiency. Therefore, this paper improves the DDPG algorithm with double critic networks and a prioritized experience replay mechanism, and then proposes a lane following method based on the improved algorithm. Experiments show that the algorithm achieves excellent following results under various road conditions.
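To illustrate the double-critic idea behind the improvement, the following is a minimal sketch, assuming a PyTorch-style setup: the TD target is built from the minimum of two target critics, which curbs the overestimation of state-action values that a single DDPG critic suffers from. The function name, network arguments, shapes, and discount factor are illustrative assumptions, not the paper's exact implementation.

```python
import torch

def double_critic_target(reward, next_state, done, actor_target,
                         critic1_target, critic2_target, gamma=0.99):
    """Compute a TD target from the minimum of two target critics.

    Taking the element-wise minimum of the two critic estimates reduces the
    overestimation bias of a single-critic DDPG update. All networks and the
    discount factor here are illustrative assumptions.
    """
    with torch.no_grad():
        next_action = actor_target(next_state)            # deterministic target policy
        q1 = critic1_target(next_state, next_action)
        q2 = critic2_target(next_state, next_action)
        # Bootstrapped target; (1 - done) masks out terminal transitions.
        target_q = reward + gamma * (1.0 - done) * torch.min(q1, q2)
    return target_q
```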

Highlights

  • Lane following is one of the most important autonomous driving subsystems

  • To solve this problem, the authors in [22] propose the deep deterministic policy gradient (DDPG) algorithm, a direct policy search method that outputs continuous action values and is therefore well suited to continuous control environments

  • The angle with the lane axis is reduced by 40% and the distance from the road centerline by 49%, indicating that the proposed algorithm handles the lane following task significantly better than the original DDPG algorithm

Summary

Introduction

Lane following is one of the most important autonomous driving subsystems. Only after the lane following function is successfully implemented can other advanced subsystems of autonomous driving, such as obstacle avoidance and car following, be further developed [1]. To solve this problem, the authors in [22] propose the DDPG algorithm, a direct policy search method that outputs continuous action values and is therefore well suited to continuous control environments. They applied it to lane following and achieved good results in the TORCS environment. However, such a project requires a lot of manpower and material resources and causes much waste of resources. To deal with this problem, this paper proposes the double critic networks and prioritized experience replay deep deterministic policy gradient (DCPER-DDPG) algorithm. The contributions are as follows: first, a lane following algorithm architecture based on deep reinforcement learning is proposed; second, the reward function, exploration strategy, and improved DDPG algorithm are designed; third, the proposed algorithm is tested and verified on the TORCS simulation platform.
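As a rough illustration of the prioritized experience replay component of DCPER-DDPG, the sketch below shows a proportional replay buffer in which transitions with larger TD error are sampled more often, which speeds up learning compared with uniform sampling. The class name, hyperparameters, and flat-array implementation are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay buffer (illustrative)."""

    def __init__(self, capacity=100_000, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha                      # how strongly priorities shape sampling
        self.storage = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are replayed at least once.
        max_prio = self.priorities.max() if self.storage else 1.0
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[:len(self.storage)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        indices = np.random.choice(len(self.storage), batch_size, p=probs)
        # Importance-sampling weights correct the bias from non-uniform sampling.
        weights = (len(self.storage) * probs[indices]) ** (-beta)
        weights /= weights.max()
        batch = [self.storage[i] for i in indices]
        return batch, indices, weights

    def update_priorities(self, indices, td_errors, eps=1e-6):
        # Priority is proportional to the magnitude of the TD error.
        self.priorities[indices] = np.abs(td_errors) + eps
```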

The remaining sections of the paper cover the following topics:

  • Critic Network Structure
  • Reward Function
  • Exploration
  • Double Critic Networks and Priority Experience Replay of DDPG Algorithm
  • Simulation Environment
  • Termination Condition Setting
  • Training
  • Analysis of Comparative Results
  • Findings
  • Conclusions