Implementation of Decentralized Reinforcement Learning-Based Multi-Quadrotor Flocking

Pramod Abichandani, Christian Speck, Donald Bucci, Deepan Lobo, William Mcintyre

doi:10.1109/access.2021.3115711

Abstract

Enabling coordinated motion of multiple quadrotors is an active area of research in the field of small unmanned aerial vehicles (sUAVs). Although many techniques in the literature address this problem, these studies are limited to simulation results and seldom account for wind disturbances. This paper presents the experimental validation of a decentralized planner based on multi-objective reinforcement learning (RL) that achieves waypoint-based flocking (separation, velocity alignment, and cohesion) for multiple quadrotors in the presence of wind gusts. The planner is learned using an object-focused, greatest mass, state-action-reward-state-action (OF-GM-SARSA) approach. The Dryden wind gust model is used to simulate wind gusts during hardware-in-the-loop (HWIL) tests. The hardware and software architecture developed for the multi-quadrotor flocking controller is described in detail. HWIL and outdoor flight test results show that the trained RL planner can generalize the flocking behaviors learned in training to the real-world flight dynamics of the DJI M100 quadrotor in windy conditions.
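The three flocking behaviors named in the abstract can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function name `flocking_quantities` and the specific definitions (minimum pairwise distance for separation, mean deviation from the flock's mean velocity for alignment, mean distance from the centroid for cohesion) are hypothetical choices and are not taken from the paper, whose exact metrics are not given in this excerpt.

```python
import numpy as np


def flocking_quantities(positions, velocities):
    """Return (min_separation, alignment_deviation, cohesion_deviation).

    positions, velocities: array-likes of shape (n, d) with n >= 2.
    All three definitions are illustrative assumptions, not the paper's.
    """
    positions = np.asarray(positions, dtype=float)
    velocities = np.asarray(velocities, dtype=float)
    n = positions.shape[0]

    # Separation: smallest distance between any two distinct vehicles.
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(n, k=1)  # each unordered pair once
    min_separation = dists[upper].min()

    # Velocity alignment: mean distance of each velocity from the flock mean.
    mean_velocity = velocities.mean(axis=0)
    alignment_deviation = np.linalg.norm(velocities - mean_velocity, axis=1).mean()

    # Cohesion: mean distance of each vehicle from the flock centroid.
    centroid = positions.mean(axis=0)
    cohesion_deviation = np.linalg.norm(positions - centroid, axis=1).mean()

    return min_separation, alignment_deviation, cohesion_deviation


if __name__ == "__main__":
    pos = [[0.0, 0.0], [3.0, 4.0]]
    vel = [[1.0, 0.0], [1.0, 0.0]]
    print(flocking_quantities(pos, vel))
```

With these assumed definitions, a well-behaved flock keeps the first value above a safety threshold while the other two stay small.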

Highlights

Small unmanned aerial vehicles (sUAVs) are a growing class of vehicles that can perform complex tasks, especially in hard-to-reach areas.
Most other publications evaluate their flocking approaches in multi-sUAV simulation environments such as ArduPilot, QGroundControl, Gazebo, and ROS [34]–[36], or in numerical simulations using Python and MATLAB [20], [22], [37]–[40]. This leaves a gap in the literature regarding the hardware/software approaches required to implement flocking-based motion planners in real-world outdoor flights.
This study provides detailed discussions on the hardware/software implementation and validation of OF-GM-SARSA applied to a multi-sUAV system that learns flocking, using HWIL and outdoor flight tests.

Summary

INTRODUCTION

Small unmanned aerial vehicles (sUAVs) are a growing class of vehicles that can perform complex tasks, especially in hard-to-reach areas. Most other publications evaluate their flocking approaches in multi-sUAV simulation environments such as ArduPilot, QGroundControl, Gazebo, and ROS [34]–[36], or in numerical simulations using Python and MATLAB [20], [22], [37]–[40]. This leaves a gap in the literature regarding the hardware/software approaches required to implement flocking-based motion planners in real-world outdoor flights. In this work, we leverage our previously developed OF-GM-SARSA-based path planner to flight-test the coordinated motion of multiple quadrotors that reach waypoints while maintaining flocking behaviors. The second listed contribution is the experimental evaluation and validation of a decentralized OF-GM-SARSA-based hardware/software architecture via outdoor flight tests involving up to four DJI M100 quadrotors operating in the presence of natural wind gusts.
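To make the name OF-GM-SARSA more concrete, here is a minimal tabular sketch of how the three ideas are commonly combined: one Q-table per object class (object-focused), action selection by the largest summed Q-value across all observed objects (greatest mass), and an on-policy SARSA update for each object. The class name `OFGMSarsa`, the `(class_name, discretized_state)` object encoding, the hyperparameter values, and the epsilon-greedy exploration are all assumptions made for illustration; the planner's actual state, action, and reward design is not given in this excerpt.

```python
import random
from collections import defaultdict


class OFGMSarsa:
    """Illustrative tabular learner; not the authors' implementation."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = random.Random(seed)
        # Object-focused: q[class_name][(state, action)] -> value.
        self.q = defaultdict(lambda: defaultdict(float))

    def mass(self, objects, action):
        """Sum of Q-values for one action over all observed objects."""
        return sum(self.q[cls][(state, action)] for cls, state in objects)

    def select_action(self, objects):
        """Epsilon-greedy over the greatest-mass action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.mass(objects, a))

    def update(self, objects, action, rewards, next_objects, next_action):
        """Apply one SARSA update per object.

        objects and next_objects are assumed to be aligned lists, so the
        i-th entries describe the same object before and after the step.
        """
        for (cls, state), reward, (_, next_state) in zip(
            objects, rewards, next_objects
        ):
            table = self.q[cls]
            target = reward + self.gamma * table[(next_state, next_action)]
            table[(state, action)] += self.alpha * (target - table[(state, action)])


if __name__ == "__main__":
    agent = OFGMSarsa(["left", "right"], alpha=0.5, gamma=0.0, epsilon=0.0)
    observed = [("neighbor", 1), ("waypoint", 2)]
    agent.update(observed, "right", [1.0, 2.0], observed, "right")
    print(agent.select_action(observed))
```

Because every object of a class shares one table, a policy of this kind does not depend on a fixed number of neighbors, which is one plausible reason such a structure suits decentralized flocking.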

RELATED WORK

STATE SPACE REPRESENTATION

ACTION SPACE REPRESENTATION

OF-GM POLICY

STATE EXPLORATION AND MODEL TRAINING

ON CONVERGENCE OF OF-GM-SARSA

SIMULATIONS USING DRYDEN MODEL

KEY EVALUATION METRICS

INTER-SUAV DISTANCES

COMMUNICATION PACKET LOSS

VELOCITY ALIGNMENT AND COHESION DEVIATIONS

Findings

DISCUSSION AND FUTURE

CONCLUSION

Journal: IEEE Access
Publication Date: Jan 1, 2021
Citations: 5
License type: CC BY 4.0
