Abstract

Drones with obstacle avoidance capabilities have recently attracted much attention from researchers. They typically adopt either supervised learning or reinforcement learning (RL) to train their networks. The drawback of supervised learning is that labeling a massive dataset is laborious and time-consuming, whereas RL overcomes this problem by letting an agent learn from data gathered in its environment. The present study utilizes diverse RL algorithms in two categories: (1) discrete action space and (2) continuous action space. The former has an advantage in optimization over vision datasets, but its quantized actions can lead to unnatural flight behavior. For the latter, we propose a U-net-based segmentation model combined with an actor-critic network. Performance is compared across these RL algorithms in three environments (the woodland, the block world, and the arena world), as well as in races against human pilots. Results suggest that our best continuous algorithm clearly outperformed the discrete ones and performed comparably to an expert pilot.
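The distinction between the two action-space categories can be sketched concretely. The snippet below is a minimal illustration, not the paper's implementation: the action names, the 4-DOF control vector, and the Gaussian policy parameters are assumptions chosen only to contrast greedy selection over a discrete action set (DQN-style) with sampling a continuous control vector from an actor's output distribution (actor-critic style).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete action set (an assumption, not the paper's exact set).
DISCRETE_ACTIONS = ["forward", "left", "right", "up", "down"]

def discrete_policy(q_values):
    """Greedy selection over Q-values, as a DQN-style agent would act."""
    return DISCRETE_ACTIONS[int(np.argmax(q_values))]

def continuous_policy(mean, log_std):
    """Sample a smooth control vector from a Gaussian policy (actor-critic style)."""
    return mean + np.exp(log_std) * rng.standard_normal(mean.shape)

# Discrete: one of a few fixed commands, which can look jerky in flight.
q = np.array([0.1, 0.5, 0.2, 0.05, 0.15])
chosen = discrete_policy(q)  # highest-valued command

# Continuous: a 4-DOF vector (e.g., roll, pitch, yaw, thrust), allowing smooth control.
action = continuous_policy(np.zeros(4), np.full(4, -1.0))
```

The discrete agent can only pick among a handful of fixed maneuvers, while the continuous agent outputs graded commands on every axis, which is why continuous policies tend to produce more natural flight paths.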

Highlights

  • Although the drone, an unmanned aerial vehicle, has been around for a long time, it has only recently become a major research field

  • To better understand how these algorithms behave, the trained network models were run in three environments: the woodland, the block world, and the arena world

  • It is found that DD-Deep Q-Network (DQN) performed best in the discrete action space, whereas ACKTR performed best in the continuous action space

Introduction

The drone is an unmanned aerial vehicle that has been around for a long time, yet it has only recently become a major research field. Its recent success likely stems from stable rotor control and the bird's-eye view provided by a front-mounted camera. One of the major research topics for the drone has been autonomous navigation. Controlling a drone with such dexterity involves solving many challenges in perception as well as in action, using lightweight yet high-performing sensors.

