Abstract

We develop a method to learn a bio-inspired motion control policy from data collected from hawkmoths navigating a virtual forest. A Markov Decision Process (MDP) framework is introduced to model the dynamics of the moths, and sparse logistic regression is used to learn the control policy parameters from the data. The results show that in navigation moths do not favor detailed obstacle-location information but rely heavily on optical flow. Using the policy learned from the moth data as a starting point, we propose an actor-critic learning algorithm to refine the policy parameters and obtain a policy that can be used by an autonomous aerial vehicle operating in a cluttered environment. In contrast to the moths' policy, the refined policy integrates both obstacle location and optical flow. We compare the performance of these two policies in terms of their ability to navigate artificial forest terrains. While the optimized policy can adjust its parameters to outperform the moths' policy in each individual terrain, the moths' policy exhibits a high level of robustness across terrains.
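To make the policy-learning step concrete, the following minimal Python sketch fits an L1-regularized (sparse) logistic regression mapping sensory features to discretized control actions, in the spirit of the approach described above. The data arrays, feature dimensions, and action coding are illustrative assumptions, not the paper's actual dataset.

# Minimal sketch of the sparse policy-learning step (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical dataset: each row stacks optical-flow features and
# obstacle-location features observed along a moth trajectory.
n_samples, n_flow, n_obstacle = 500, 8, 8
X = rng.normal(size=(n_samples, n_flow + n_obstacle))
# Hypothetical discretized actions, e.g. 0 = left, 1 = straight, 2 = right.
y = rng.integers(0, 3, size=n_samples)

# The L1 penalty drives uninformative feature weights to exactly zero,
# revealing which cues (optical flow vs. obstacle location) the policy uses.
policy = LogisticRegression(penalty="l1", solver="saga", C=0.1, max_iter=5000)
policy.fit(X, y)

# Nonzero coefficients indicate the cues the learned policy relies on.
print("nonzero weights per action:", (np.abs(policy.coef_) > 1e-8).sum(axis=1))

In this framing, a finding such as "moths rely heavily on optical flow" corresponds to the obstacle-location columns receiving mostly zero weights.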

Highlights

  • Moths and other animals are experts in navigating complex forest terrains [1, 2]

  • Can we learn their navigation strategy from observed flight trajectories, and can we refine these strategies to design UAV/drone navigation policies in dense, cluttered terrains?

  • We propose a method to analyze data from hawkmoth flight trajectories in a closed-loop virtual forest and extract the navigation control policy



Introduction

Moths and other animals are experts in navigating complex forest terrains [1, 2]. Recent work in [3] experimented with hawkmoths (Manduca sexta) playing "video games" of navigation. Compared with a distribution of trajectories randomized via resampling, [3] suggests that moths respond to external stimuli and follow a deliberate, goal-directed navigation path. Such behaviors could inspire novel data-driven algorithms for the control of autonomous vehicles performing complex tasks, including collision avoidance [4], navigation [5], and Simultaneous Localization And Mapping (SLAM) [6]. Problems of this type are typically formulated as dynamic optimization problems which can, in principle, be solved by dynamic programming techniques [7]. Such methods, however, do not lead to practical policies for problems of realistic size due to the well-known curse of dimensionality: there are too many states a vehicle can occupy and too many feasible control actions at each state.
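Parameterized policies updated by policy-gradient methods sidestep this explosion: rather than tabulating a value for every state, an actor-critic scheme of the kind used to refine the moth policy adjusts a fixed number of weights. The sketch below shows a single TD(0) actor-critic update for a linear softmax policy; the feature map, reward, and step sizes are assumptions for illustration, not the paper's actual design.

# Minimal sketch of one TD(0) actor-critic update (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
n_features, n_actions = 16, 3
theta = rng.normal(scale=0.01, size=(n_actions, n_features))  # actor weights
w = np.zeros(n_features)                                       # critic weights
gamma, alpha, beta = 0.99, 0.01, 0.05

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

# Hypothetical transition (phi, a, r, phi_next); in the paper's setting,
# phi would encode optical-flow and obstacle-location features of the state.
phi = rng.normal(size=n_features)
p = softmax(theta @ phi)
a = rng.choice(n_actions, p=p)
r = -1.0                                         # e.g. penalty near an obstacle
phi_next = rng.normal(size=n_features)

delta = r + gamma * (w @ phi_next) - (w @ phi)   # critic's TD error
w += beta * delta * phi                          # critic moves toward TD target
grad_log = -np.outer(p, phi)                     # grad of log pi(a|phi) wrt theta
grad_log[a] += phi
theta += alpha * delta * grad_log                # actor follows the policy gradient

Because only theta and w are updated, the cost of each step scales with the number of features rather than the number of states, which is what makes such a refinement practical where dynamic programming is not.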

