Learning efficient navigation in vortical flow fields

Peter Gunnarson, John O Dabiri, Ioannis Mandralis, Guido Novati, Petros Koumoutsakos

doi:10.1038/s41467-021-27015-y

Open Access

Abstract

Efficient point-to-point navigation in the presence of a background flow field is important for robotic applications such as ocean surveying. In such applications, robots may only have knowledge of their immediate surroundings or be faced with time-varying currents, which limits the use of optimal control techniques. Here, we apply a recently introduced Reinforcement Learning algorithm to discover time-efficient navigation policies to steer a fixed-speed swimmer through unsteady two-dimensional flow fields. The algorithm entails inputting environmental cues into a deep neural network that determines the swimmer’s actions, and deploying Remember and Forget Experience Replay. We find that the resulting swimmers successfully exploit the background flow to reach the target, but that this success depends on the sensed environmental cue. Surprisingly, a velocity sensing approach significantly outperformed a bio-mimetic vorticity sensing approach, and achieved a near 100% success rate in reaching the target locations while approaching the time-efficiency of optimal navigation trajectories.

Highlights

Efficient point-to-point navigation in the presence of a background flow field is important for robotic applications such as ocean surveying
We find that Deep Reinforcement Learning (RL) can discover time-efficient, robust paths through an unsteady, two-dimensional (2D) flow field using only local flow information, where simpler strategies such as swimming towards the target largely fail at the task
We have shown in this study how Deep RL can discover robust and time-efficient navigation policies which are improved by sensing local flow information

Summary

Introduction

Efficient point-to-point navigation in the presence of a background flow field is important for robotic applications such as ocean surveying. A single 128 × 128 deep neural network is used for the navigation policy, which accepts the swimmer's state (i.e., flow information and relative position) and outputs the swimming direction as continuous variables. The ability of Deep RL to discover these effective navigation strategies depends on the type of local flow information included in the swimmer state.
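
To make the state-to-action mapping concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that "128 × 128" means two hidden layers of 128 units each, that the state is the 2D relative position to the target concatenated with the locally sensed 2D flow velocity, and that the swimming direction is obtained by normalizing a two-component output. The names `init_policy`, `swim_direction`, and `advance` are hypothetical, the weights are random (untrained), and the update rule is a simple explicit Euler step in which the fixed-speed swimmer is advected by the background flow.

```python
import numpy as np


def init_policy(state_dim, hidden=128, seed=0):
    """Random weights for a state_dim -> 128 -> 128 -> 2 network (assumed layout)."""
    rng = np.random.default_rng(seed)
    sizes = [state_dim, hidden, hidden, 2]
    return [
        (rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]


def swim_direction(params, state):
    """Map the swimmer state to a unit vector giving the swimming direction."""
    h = np.asarray(state, dtype=float)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)  # hidden layers (tanh is an assumption)
    W, b = params[-1]
    out = h @ W + b
    norm = np.linalg.norm(out)
    if norm < 1e-12:
        return np.array([1.0, 0.0])  # arbitrary fallback for a degenerate output
    return out / norm


def advance(position, direction, flow_velocity, swim_speed, dt):
    """One explicit Euler step: fixed swimming speed plus background flow."""
    return position + (swim_speed * direction + flow_velocity) * dt


# Hypothetical state: relative position to target (dx, dy) and sensed flow (u, v).
params = init_policy(state_dim=4)
state = np.array([1.0, 0.5, 0.1, -0.2])
direction = swim_direction(params, state)
new_position = advance(np.zeros(2), direction, state[2:], swim_speed=0.8, dt=0.1)
```

In an RL setting the weights would be updated from experience instead of left at their random initial values; the sketch only shows how a state vector becomes a heading and how that heading combines with the flow to move the swimmer.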

Results

Conclusion