Abstract

A deep reinforcement learning approach to the quadrotor path following and obstacle avoidance problem is proposed in this paper. The problem is solved with two agents: one for the path following task and another for the obstacle avoidance task. A novel structure is proposed in which the action computed by the obstacle avoidance agent becomes the state of the path following agent. Compared to traditional deep reinforcement learning approaches, the proposed method makes the training outcomes interpretable, trains faster and can be trained safely on the real quadrotor. Both agents implement the Deep Deterministic Policy Gradient algorithm. The path following agent was developed in a previous work. The obstacle avoidance agent uses the information provided by a low-cost LIDAR to detect obstacles around the vehicle. Since the LIDAR has a narrow field of view, an approach for providing the agent with a memory of previously seen obstacles is developed. A detailed description of the process of defining the state vector, the reward function and the action of this agent is given. The agents are programmed in Python/TensorFlow and are trained and tested on the RotorS/Gazebo platform. Simulation results prove the validity of the proposed approach.
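The obstacle memory mentioned above can be illustrated with a minimal sketch: past LIDAR detections are kept in the vehicle's body frame and re-expressed after each motion, so obstacles that have left the narrow field of view remain visible to the agent. This is a hypothetical illustration of the idea, not the paper's implementation; all names (`ObstacleMemory`, `forget_radius`, the update signature) are assumptions.

```python
import numpy as np

class ObstacleMemory:
    """Keeps recent obstacle points in the vehicle's body frame."""

    def __init__(self, max_points=64, forget_radius=5.0):
        self.points = np.empty((0, 2))       # stored points, body frame
        self.max_points = max_points
        self.forget_radius = forget_radius   # drop points beyond this range

    def update(self, new_points, dx, dy, dyaw):
        # Re-express stored points in the new body frame after the
        # vehicle translated by (dx, dy) and rotated by dyaw.
        c, s = np.cos(-dyaw), np.sin(-dyaw)
        R = np.array([[c, -s], [s, c]])
        shifted = (self.points - np.array([dx, dy])) @ R.T
        merged = np.vstack([shifted, new_points])
        # Forget points outside the radius of interest; keep the newest.
        near = merged[np.linalg.norm(merged, axis=1) < self.forget_radius]
        self.points = near[-self.max_points:]
        return self.points

mem = ObstacleMemory()
mem.update(np.array([[1.0, 0.0]]), 0.0, 0.0, 0.0)   # obstacle 1 m ahead
pts = mem.update(np.empty((0, 2)), 0.5, 0.0, 0.0)   # move 0.5 m forward
# The remembered obstacle is now 0.5 m ahead, even with no new detection.
```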

Highlights

  • Recent years have seen exponential growth in research on, and applications of, deep reinforcement learning (DRL)

  • A novel structure formed by two agents, the obstacle avoidance (OA) agent developed in this paper and the path following (PF) agent developed in [6], is proposed

  • The main novelty of the approach is that the OA agent computes an action that directly modifies the state of the PF agent, forming a cascade structure


Summary

Introduction

Recent years have seen exponential growth in research on, and applications of, deep reinforcement learning (DRL). It has been applied to a large number of computer science, engineering and control problems with outstanding results [1,2,3,4,5]. In [6], a DRL algorithm was implemented to solve the path following (PF) problem with adaptive velocity for a quadrotor, obtaining successful experimental results. The capabilities of this learning paradigm are taken further here by implementing a deep reinforcement learning agent that solves the reactive obstacle avoidance problem. This agent is combined with the one developed in [6], configuring a complete path following and obstacle avoidance autonomous solution.
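The cascade structure described above, where the OA agent's action becomes part of the PF agent's state, can be sketched as follows. The policies are stubbed with placeholder functions purely for illustration; the paper's agents are trained DDPG actor networks, and all names here (`oa_policy`, `pf_policy`, `cascade_step`) are assumptions.

```python
import numpy as np

def oa_policy(lidar_ranges):
    # Placeholder OA policy: steer away from the closest obstacle.
    # In the paper this would be a trained DDPG actor network.
    closest = int(np.argmin(lidar_ranges))
    return np.array([1.0 if closest < len(lidar_ranges) // 2 else -1.0])

def pf_policy(pf_state):
    # Placeholder PF policy: proportional correction on the path error.
    return -0.5 * pf_state[:2]

def cascade_step(path_error, lidar_ranges):
    # 1) The OA agent computes its action from the LIDAR observation.
    oa_action = oa_policy(lidar_ranges)
    # 2) That action is appended to the PF agent's state: the cascade.
    pf_state = np.concatenate([path_error, oa_action])
    # 3) The PF agent then acts on the augmented state.
    return pf_policy(pf_state), pf_state

action, state = cascade_step(np.array([0.2, -0.1]), np.ones(16))
```

The key design point is that the two agents are not merged into one network: the OA agent only perturbs the PF agent's state, which keeps each agent's training outcome interpretable and lets the PF agent be reused unchanged from [6].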


