Abstract

We present a new machine learning algorithm for learning optimal feedback control policies that guide a robot to a goal in the presence of obstacles. Our method works by first reducing obstacle avoidance to a continuous-state, continuous-action, continuous-time optimal control problem, and then using efficient collocation methods to solve for an optimal feedback control policy. This formulation of the obstacle avoidance problem improves over standard approaches, such as potential field methods, by being resistant to local minima, allowing for moving obstacles, handling stochastic systems, and computing feedback control strategies that account for the robot's (possibly non-linear) dynamics. In addition to contributing a new method for obstacle avoidance, our work advances the state of the art in collocation methods for non-linear stochastic optimal control problems in two important ways: (1) we show that using local gradient and second-order derivative information of the optimal value function at the collocation points allows us to exploit derivative information about the system dynamics, and (2) we show that computational savings can be achieved by directly fitting the gradient of the optimal value function rather than the optimal value function itself. We validate our approach on three problems: non-convex obstacle avoidance of a point-mass robot, obstacle avoidance for a 2-degree-of-freedom robotic manipulator, and optimal control of a non-linear dynamical system.
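The second contribution, fitting the gradient of the value function and enforcing the Hamilton-Jacobi-Bellman (HJB) equation at collocation points, can be illustrated on a toy problem. The sketch below is a hypothetical 1-D linear-quadratic example, not the paper's actual system or implementation: it parameterizes the value-function gradient as `grad_V(x) = 2*p*x`, drives the HJB residual to zero at a set of collocation points, and checks the result against the known scalar Riccati solution. All variable names (`a`, `b`, `q`, `r`, `p`) are assumptions chosen for the illustration.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical scalar system: dynamics x' = a*x + b*u,
# infinite-horizon cost = integral of q*x^2 + r*u^2 dt.
a, b, q, r = 0.5, 1.0, 1.0, 1.0

# Collocation points at which the HJB equation is enforced.
xs = np.linspace(-2.0, 2.0, 15)
xs = xs[xs != 0.0]  # drop x = 0, where the residual is trivially zero

def hjb_residual(params):
    """HJB residual at the collocation points, with the value-function
    *gradient* parameterized directly as grad_V(x) = 2*p*x."""
    p = params[0]
    grad_V = 2.0 * p * xs
    u = -b * grad_V / (2.0 * r)  # control minimizing the HJB Hamiltonian
    return q * xs**2 + r * u**2 + grad_V * (a * xs + b * u)

# Fit p by least-squares on the collocation residuals.
p_fit = least_squares(hjb_residual, x0=[1.0]).x[0]

# Closed-form solution of the scalar algebraic Riccati equation, for comparison.
p_true = r * (a + np.sqrt(a**2 + b**2 * q / r)) / b**2
print(p_fit, p_true)
```

Fitting `grad_V` directly, rather than a value function that must then be differentiated, is the kind of computational shortcut the abstract refers to; in higher dimensions the same residual-at-collocation-points structure applies with a richer gradient parameterization.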
