Abstract

Deep Reinforcement Learning (RL) algorithms are widely used in autonomous driving due to their ability to cope with unseen environments. However, in a complex domain such as autonomous driving, these algorithms must explore the environment extensively before they can converge, so they suffer from long training times and require large amounts of data. In addition, applying deep RL in areas where safety is an important factor, such as autonomous driving, raises safety concerns, since the car cannot be left driving in the street unattended. In this research, we tested two methods for reducing the training time. First, we pre-trained Soft Actor-Critic (SAC) with Learning from Demonstrations (LfD) to determine whether pre-training can reduce the training time of the SAC algorithm. Then, an online end-to-end combination of SAC, LfD, and Learning from Interventions (LfI) is proposed to train an agent (dubbed Online Virtual Training). Both scenarios were implemented and tested on an inverted-pendulum task in OpenAI Gym and on autonomous driving in the CARLA simulator. The results showed a dramatic reduction in training time and a significant increase in rewards for Online LfD (33%) and Online Virtual Training (36%) compared to the baseline SAC. The proposed approach is expected to be effective in daily-commute scenarios for autonomous driving.
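The LfD pre-training idea mentioned above is commonly realized by seeding the off-policy agent's replay buffer with expert transitions before any environment interaction. The sketch below illustrates that pattern in a minimal form; the buffer structure, transition format, and demonstration data here are illustrative assumptions, not the paper's implementation.

```python
import random
from collections import deque


class ReplayBuffer:
    """Minimal replay buffer; an off-policy learner like SAC samples from it."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling over stored transitions (demo and agent data alike).
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


def seed_with_demonstrations(buffer, demos):
    """LfD-style pre-fill: load expert transitions before the agent acts."""
    for transition in demos:
        buffer.add(transition)
    return buffer


# Hypothetical demonstration transitions: (state, action, reward, next_state, done).
demos = [((0.0,), 0.1, 1.0, (0.1,), False) for _ in range(100)]
buf = seed_with_demonstrations(ReplayBuffer(capacity=10_000), demos)

# Early SAC updates would then draw batches that contain expert experience,
# which is the mechanism by which pre-training can shorten exploration.
batch = buf.sample(32)
```

In an online LfD/LfI variant, intervention transitions collected during driving would be appended to the same buffer as they occur, so expert corrections immediately influence subsequent updates.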
