Abstract

This article proposes a hierarchical learning architecture for safe data-driven control in unknown environments. We consider a constrained nonlinear dynamical system and assume the availability of state-input trajectories solving control tasks in different environments. In addition to task-invariant system state and input constraints, a parameterized environment model generates task-specific state constraints, which are satisfied by the stored trajectories. Our goal is to use these trajectories to find a safe and high-performing policy for a new task in a new, unknown environment. We propose using the stored data to learn generalizable control strategies. At each time step, based on a local forecast of the new task environment, the learned strategy provides a target region in the state space and input constraints to guide the system evolution to the target region. These target regions are used as terminal sets by a low-level model predictive controller. We show how to i) design the target sets from past data and then ii) incorporate them into a model predictive control scheme with shifting horizon that ensures safety of the closed-loop system when performing the new task. We prove the feasibility of the resulting control policy, and apply the proposed method to robotic path planning, racing, and computer game applications.

Note to Practitioners—This paper was motivated by the challenge of designing safe controllers for autonomous systems navigating through new environments. We consider scenarios where trajectory data from control tasks in different environments is available to the control designer.
Possible applications include autonomous vehicles racing on new tracks or robotic manipulators performing tasks in the presence of new obstacles. Existing approaches to model-based control design for new environments generally use trajectory libraries, systematically adapting stored trajectories to the constraints of the new environment. This typically requires a priori knowledge of the entire task environment, as well as resources to store and maintain the growing library. This paper proposes a new hierarchical control approach in which stored trajectories are used to learn high-level strategies that can be applied while solving the new task. The strategies are learned offline, and only the parameterized strategy function needs to be stored for online control. Strategies require knowledge of only the nearby task environment and provide navigation guidelines for the system. In this paper, we show how to find such strategies from previous task data and how to integrate them into a low-level controller that safely and efficiently solves the new task. We also show how to adapt the modular framework to a user's desired application. Simulation experiments with robotic manipulator, autonomous vehicle, and computer game examples suggest that our approach can be used in a wide range of applications. In future research, we will address how to adapt the method to time-varying or stochastic environments.
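The hierarchical structure described above (a high-level strategy that maps a local environment forecast to a target region and input constraints, and a low-level controller that tracks that target) can be illustrated with a minimal sketch. This is not the paper's algorithm: it uses hypothetical names, scalar integrator dynamics, a hand-coded strategy in place of the learned one, and a one-step saturated controller standing in for the model predictive controller.

```python
# Illustrative sketch only, not the paper's method. A hand-coded
# "strategy" proposes a target interval and input bounds from nearby
# obstacles; a one-step saturated controller (stand-in for the MPC)
# drives a scalar integrator x_{k+1} = x_k + u_k toward the target.

def strategy(x, local_obstacles):
    """High-level strategy: shrink a nominal look-ahead interval so it
    does not overlap nearby obstacles; return it with input bounds."""
    lo, hi = x + 0.5, x + 1.5            # nominal look-ahead target region
    for olo, ohi in local_obstacles:      # obstacles as (start, end) intervals
        if olo < hi and ohi > lo:         # overlap: stop target before obstacle
            hi = min(hi, olo)
            lo = min(lo, hi)
    return (lo, hi), (-1.0, 1.0)          # target set, input constraints

def low_level_step(x, target, u_bounds):
    """Low-level controller: one proportional step toward the target's
    center, saturated to the input constraints."""
    center = 0.5 * (target[0] + target[1])
    u = max(u_bounds[0], min(u_bounds[1], center - x))
    return x + u                          # scalar integrator dynamics

def run(x0, obstacles, steps=10):
    """Closed loop: query the strategy, then apply the low-level step."""
    x = x0
    for _ in range(steps):
        target, u_bounds = strategy(x, obstacles)
        x = low_level_step(x, target, u_bounds)
    return x
```

With no obstacles the state advances freely; with an obstacle at, say, `(3.0, 4.0)`, the strategy clips the target region so the closed-loop state stops at the obstacle boundary, mimicking how the learned target sets keep the low-level controller safe.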
