Temporal Parallelization of Dynamic Programming and Linear Quadratic Control

Simo Sarkka, Angel F Garcia-Fernandez

doi:10.1109/tac.2022.3147017

Abstract

This article proposes a general formulation for temporal parallelization of dynamic programming for optimal control problems. We derive the elements and associative operators to be able to use parallel scans to solve these problems with logarithmic time complexity rather than linear time complexity. We apply this methodology to problems with finite state and control spaces, linear quadratic tracking control problems, and to a class of nonlinear control problems. The computational benefits of the parallel methods are demonstrated via numerical simulations run on a graphics processing unit.

Highlights

Optimal control theory is concerned with designing control signals to steer a system such that a given cost function is minimised, or equivalently, a performance measure is maximised.
Dynamic programming, in the form first introduced by Bellman in the 1950s, is a general method for determining feedback laws for optimal control and other sequential decision problems [3], [8]–[10], and it forms the basis of reinforcement learning [7], which is a subfield of machine learning.
The classic dynamic programming algorithm is a sequential procedure that proceeds backwards from the final time step to the initial time step, and determines the value function as well as the optimal control law in O(T) time, where T is the number of time steps.
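
To make the contrast between the sequential recursion and an associative formulation concrete, the following is a minimal sketch for a small finite-state problem. It is an illustration under stated assumptions, not the authors' GPU implementation: the problem is deterministic and time-invariant, the function names (`backward_dp`, `one_step_element`, `min_plus`, `tree_reduce`, `value_by_tree`) are hypothetical, and the pairwise tree is evaluated serially in NumPy. Each element is a matrix of state-to-state costs, and two elements are combined by minimising over the intermediate state (a min-plus product). Because that combination is associative, T elements can be merged in about log2(T) rounds of independent pairwise combinations.

```python
import numpy as np

def backward_dp(f, c, c_T, T):
    """Classic sequential recursion: T dependent steps, backwards in time."""
    V = c_T.astype(float)
    for _ in range(T):
        # V[f] has shape (n_states, n_controls): cost-to-go of each successor.
        V = (c + V[f]).min(axis=1)
    return V

def one_step_element(f, c):
    """A[x, y] = cheapest one-step cost from state x to state y (inf if unreachable)."""
    n, m = f.shape
    A = np.full((n, n), np.inf)
    for x in range(n):
        for u in range(m):
            A[x, f[x, u]] = min(A[x, f[x, u]], c[x, u])
    return A

def min_plus(A, B):
    """Associative combination: minimise over the intermediate state."""
    return (A[:, :, None] + B[None, :, :]).min(axis=1)

def tree_reduce(elems):
    """Combine adjacent pairs until one element is left.

    The combinations within a round are independent of each other, so a
    parallel machine needs only about log2(len(elems)) rounds. Here the
    rounds are simply executed one pair after another.
    """
    elems = list(elems)
    while len(elems) > 1:
        nxt = [min_plus(elems[i], elems[i + 1]) for i in range(0, len(elems) - 1, 2)]
        if len(elems) % 2:
            nxt.append(elems[-1])  # odd element carries over, order preserved
        elems = nxt
    return elems[0]

def value_by_tree(f, c, c_T, T):
    """Initial value function from the combined T-step element and terminal cost."""
    P = tree_reduce([one_step_element(f, c)] * T)
    return (P + c_T[None, :]).min(axis=1)

# Small random example (assumed data, for illustration only).
rng = np.random.default_rng(0)
n, m, T = 5, 3, 13
f = rng.integers(0, n, size=(n, m))                  # next state for (state, control)
c = rng.integers(0, 10, size=(n, m)).astype(float)   # stage cost
c_T = rng.integers(0, 10, size=n).astype(float)      # terminal cost

assert np.allclose(backward_dp(f, c, c_T, T), value_by_tree(f, c, c_T, T))
```

The sketch only reduces to the value function at the initial time step. A full parallel scan would also return the intermediate value functions, from which the control law at every step can be read off.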

Summary

Introduction

Optimal control theory (see, e.g., [1]–[3]) is concerned with designing control signals to steer a system such that a given cost function is minimised, or equivalently, a performance measure is maximised. In Bellman’s dynamic programming [3], [8], [9] the idea is to form a cost-to-go or value function V_k(x_k) which gives the cost of the trajectory when we follow the optimal decisions for the remaining steps up to T starting from state x_k. It can be shown [8] that the value function admits the recursion
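
A standard way to write Bellman's recursion is sketched here in assumed notation, which is not defined in the text above: a deterministic model x_{k+1} = f_k(x_k, u_k), stage costs \ell_k, and a terminal cost \ell_T.

```latex
\[
  V_k(x_k) = \min_{u_k} \bigl\{ \ell_k(x_k, u_k) + V_{k+1}\bigl(f_k(x_k, u_k)\bigr) \bigr\},
  \qquad k = T-1, \ldots, 0,
  \qquad V_T(x_T) = \ell_T(x_T).
\]
```

Each V_k depends on V_{k+1}, so evaluating the recursion directly takes T dependent steps. The optimal control at step k is the u_k that attains the minimum.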

Methods

Results

Conclusion