Abstract
The objective of the drift counteraction optimal control (DCOC) problem is to compute an optimal control law that maximizes the expected time before specified system constraints are violated. In this paper, we reformulate the DCOC problem as a reinforcement learning (RL) problem, removing the requirements for disturbance measurements and for prior knowledge of the disturbance evolution. The optimal control policy for the DCOC problem is then trained with RL algorithms. As an example, we treat the problem of adaptive cruise control, where the objective is to maintain a desired distance headway and time headway from the lead vehicle, while the acceleration and speed of the host vehicle are constrained based on safety, comfort, and fuel economy considerations. An informed approximate Q-learning algorithm is developed that achieves efficient training, fast convergence, and good performance. The control performance is compared with that of a heuristic driver model in simulation, and superior performance is demonstrated.
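The connection between the DCOC objective and an RL reward can be sketched with a minimal, illustrative example (not the paper's algorithm): tabular Q-learning on a simplified one-dimensional car-following task, where the agent receives a reward of 1 for every time step the headway constraint holds and the episode ends on violation, so the learned policy maximizes the expected time until constraint violation. All dynamics, constraint bounds, discretization, and hyperparameters below are assumptions for illustration.

```python
import random

random.seed(0)

ACTIONS = [-1.0, 0.0, 1.0]       # host accelerations (m/s^2), assumed grid
D_MIN, D_MAX = 5.0, 50.0         # headway constraint bounds (m), assumed
DT = 0.5                         # time step (s), assumed

def step(d, v_rel, a):
    """One Euler step of headway d and relative speed v_rel (lead - host)."""
    v_rel_next = v_rel - a * DT  # accelerating the host shrinks v_rel
    d_next = d + v_rel_next * DT
    violated = not (D_MIN <= d_next <= D_MAX)
    return d_next, v_rel_next, violated

def bucket(d, v_rel):
    """Coarse state discretization for the tabular Q-function."""
    return (int(d // 5), int(round(v_rel)))

Q = {}
def q(s, ai):
    return Q.get((s, ai), 0.0)

alpha, gamma, eps = 0.1, 0.95, 0.2
for episode in range(500):
    d, v_rel = 20.0, random.uniform(-2, 2)
    s = bucket(d, v_rel)
    for t in range(200):
        # Epsilon-greedy action selection.
        ai = (random.randrange(len(ACTIONS)) if random.random() < eps
              else max(range(len(ACTIONS)), key=lambda i: q(s, i)))
        d, v_rel, violated = step(d, v_rel, ACTIONS[ai])
        s2 = bucket(d, v_rel)
        # Reward 1 per surviving step; violation terminates the episode,
        # so the return counts (discounted) time until violation.
        target = 0.0 if violated else 1.0 + gamma * max(
            q(s2, i) for i in range(len(ACTIONS)))
        Q[(s, ai)] = q(s, ai) + alpha * (target - q(s, ai))
        if violated:
            break
        s = s2

# Greedy action at a nominal state after training.
s0 = bucket(20.0, 0.0)
best = max(range(len(ACTIONS)), key=lambda i: q(s0, i))
print("greedy action at nominal state:", ACTIONS[best])
```

The paper's informed approximate Q-learning would replace the naive table and exploration above with a function approximator and an informed training scheme; this sketch only shows how the survive-per-step reward encodes the DCOC objective.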