Abstract
This paper presents a supervised reinforcement learning (SRL)-based framework for longitudinal vehicle dynamics control of a cooperative adaptive cruise control (CACC) system. A supervisor network trained on real driving data is incorporated into the actor-critic reinforcement learning approach. In the SRL training process, the actor and critic networks are updated under the guidance of the supervisor and the gain scheduler. As a result, the training success rate is improved, and the actor can learn the driver's characteristics to achieve a human-like CACC controller. The SRL-based control policy is compared with a linear controller in typical driving situations through simulation, and control policies trained on data from drivers with different driving styles are compared using a real driving cycle. Furthermore, the proposed control strategy is demonstrated in a real vehicle-following experiment with different time headways. The simulation and experimental results not only validate the effectiveness and adaptability of the SRL-based CACC system, but also show that it can provide natural, human-like following performance.
Highlights
With the growing demand for transportation in modern society, concern about road safety and traffic congestion is increasing [1,2]
We propose a supervised reinforcement learning (SRL)-based framework for the longitudinal vehicle dynamics control of a cooperative adaptive cruise control (CACC) system
A supervisor trained on driving data from a human driver is introduced to guide the reinforcement learning of a CACC system
Summary
With the growing demand for transportation in modern society, concern about road safety and traffic congestion is increasing [1,2]. In [11], an optimal control strategy is employed to develop a synthesis strategy for both distributed controllers and the communication topology that guarantees string stability. On the other hand, CACC is more attractive than conventional autonomous ACC because the system behavior is more responsive to changes in the preceding vehicle's speed, thereby enabling shorter following gaps and enhancing traffic throughput, fuel economy, and road safety [12]. In [37], a parameterized batch RL algorithm for near-optimal longitudinal velocity tracking is proposed, in which parameterized kernel-based feature vectors are learned from collected samples to approximate the value functions and policies, and the effectiveness of the controller is validated on an autonomous vehicle platform.