Improved Q-Learning Algorithm Based on Approximate State Matching in Agricultural Plant Protection Environment.

Fengjie Sun, Rui Zhang, Xianchang Wang

doi:10.3390/e23060737

Abstract

Unmanned Aerial Vehicles (UAVs) can greatly reduce the manpower needed for agricultural plant protection tasks such as watering, sowing, and pesticide spraying. It is essential to develop a Decision-making Support System (DSS) that helps a UAV choose the correct action in each state according to a policy. In an unknown environment, formulating rules by hand for choosing actions is not applicable, and obtaining the optimal policy through reinforcement learning is a feasible solution. However, experiments show that existing reinforcement learning algorithms cannot obtain the optimal policy for a UAV in the agricultural plant protection environment. In this work we propose an improved Q-learning algorithm based on similar state matching, and we prove theoretically that, in the agricultural plant protection environment, a UAV following the policy learned by the proposed algorithm has a greater probability of choosing the optimal action than one following the policy learned by the classic Q-learning algorithm. The proposed algorithm is implemented and tested on evenly distributed datasets based on real UAV parameters and real farm information. The performance of the algorithm is discussed in detail. Experimental results show that the proposed algorithm can efficiently learn the optimal policy for UAVs in the agricultural plant protection environment.

Highlights

We take the history record of an Unmanned Aerial Vehicle's (UAV's) actions as the current state, so different states may have different lengths. Through experiments we find that the classic Q-learning algorithm and Deep Reinforcement Learning (DeepRL) algorithms are not suitable for solving problems in this environment.
We propose an Approximate State Matching Q-learning algorithm that can obtain the optimal policy for UAVs.
We analyze the performance of the proposed algorithm and prove, through theorems, its advantages over the classic Q-learning algorithm in the agricultural plant protection environment.
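
To illustrate why action-history states of different lengths are awkward for an exact-match Q-table, here is a minimal Python sketch of a lookup that falls back to the most similar stored state. It is not the paper's algorithm: the shared-prefix similarity measure, the `threshold` parameter, the function names, and the action names are all assumptions made for illustration.

```python
def similarity(a, b):
    """Hypothetical similarity: shared-prefix length divided by the longer length."""
    if not a and not b:
        return 1.0
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


def lookup(q_table, state, threshold=0.5):
    """Return the Q-values stored for `state`.

    If `state` was never visited, fall back to the stored state with the
    highest similarity, provided it reaches `threshold` (an assumed cutoff).
    Returns an empty dict when nothing is similar enough.
    """
    if state in q_table:
        return q_table[state]
    best, best_sim = None, threshold
    for known in q_table:
        sim = similarity(state, known)
        if sim >= best_sim:
            best, best_sim = known, sim
    return q_table[best] if best is not None else {}


# States are tuples of past actions, so their lengths differ.
q_table = {
    ("spray", "move"): {"spray": 0.2, "return": 0.9},
    ("spray",): {"move": 0.5},
}

# This history was never visited; the closest stored state is ("spray", "move").
print(lookup(q_table, ("spray", "move", "spray")))
```

With exact matching alone, the unseen three-action history would have no Q-values at all; the fallback reuses what was learned for a nearby history.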

Summary

Introduction

Deep Reinforcement Learning (DeepRL) [4] has achieved remarkable results in many research areas such as physics-based animation, robotics, computer vision, and games. It aims to find an optimal policy that maximizes cumulative rewards and is well suited to problems with continuous and high-dimensional states and actions [5]. We study the problem of forming a policy for UAVs through reinforcement learning, using the agricultural plant protection environment as an example; the model and conclusions obtained can be applied to other decision-making or reinforcement learning problems.

Background

Related Work

Reinforcement Learning

Boltzmann distribution:
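
As a minimal sketch, assuming the standard Boltzmann (softmax) exploration rule, in which action a in state s is chosen with probability exp(Q(s, a) / T) divided by the sum of exp(Q(s, b) / T) over all actions b, action selection could look like the following. The temperature parameter `temperature`, the function names, and the action labels are assumptions; the paper's exact form may differ.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature=1.0):
    """Map {action: Q-value} to {action: selection probability}."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum Q-value before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    q_max = max(q_values.values())
    weights = {a: math.exp((q - q_max) / temperature) for a, q in q_values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def sample_action(q_values, temperature=1.0, rng=random):
    """Draw one action according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(q_values, temperature)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


# Hypothetical Q-values for one state.
q_values = {"spray": 1.0, "move": 0.5, "return": 0.0}
print(boltzmann_probabilities(q_values, temperature=0.5))
print(sample_action(q_values, temperature=0.5))
```

A high temperature makes the choice close to uniform, while a low temperature makes it close to greedy selection of the highest Q-value.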

Problem Description

Actions

Transition

Reward

Problem Solution

Approximate State Matching Q-Learning Algorithm

Analysis of Algorithms

Experiment

UAV Specifications

Farm Information

Data Simulation

Evaluation Indicator

Result

Conclusions and Future Work

Journal: Entropy (Basel, Switzerland)
Publication Date: Jun 11, 2021
Citations: 7
License type: CC BY 4.0

Similar Papers

Task scheduling system for UAV operations in agricultural plant protection environment
Fengjie Sun ... Xianchang Wang
Journal of Ambient Intelligence and Humanized Computing, 06 May 2020

Research on the Task Scheduling System for Agricultural Plant Protection UAV
Sun Fengjie ... Wang Xianchang
04 Aug 2019

A New Optimization Method application to Agricultural Plant Protection UAV Scheduling
Sun Fengjie ... Wang Xianchang
01 Dec 2019

Federated Reinforcement Learning-Based UAV Swarm System for Aerial Remote Sensing
Woonghee Lee
Wireless Communications and Mobile Computing, VOL. 2022, 29 Apr 2022
