A Generic Markov Decision Process Model and Reinforcement Learning Method for Scheduling Agile Earth Observation Satellites

Yongming He, Lining Xing, Yingwu Chen, Witold Pedrycz, Guohua Wu, Ling Wang

doi:10.1109/tsmc.2020.3020732

Abstract

We investigate a general reinforcement-learning solution to the agile satellite scheduling problem. The core idea is to learn, from experience, a value function that estimates the long-term benefit of a given state, and then to use this value function to guide decisions in unseen situations. First, the scheduling process is modeled as a finite Markov decision process with a continuous state space and a discrete action space. The two subproblems of agile Earth observation satellite scheduling, the sequencing problem and the timing problem, are handled by the agent and the environment of the model, respectively. A constructive heuristic algorithm quickly produces a satisfactory solution to the timing problem. The objective is to maximize the total reward over the entire scheduling process. Based on this design, we demonstrate that a Q-network has advantages in fitting the long-term benefit of such problems, and we then train the Q-network by Q-learning. Experimental results show that the trained Q-network copes efficiently with unseen data and generates high total profit in a short time. The method scales well and can be applied to different types of satellite scheduling problems by customizing only the constraint-checking process and the reward signals.
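The division of labor described in the abstract (the agent picks the next task, the environment times it with a constructive heuristic and returns a reward) can be illustrated with a small sketch. This is not the authors' implementation: the task data, the fixed transition time, the earliest-start timing rule, and the hand-picked features are all hypothetical, and a linear approximator stands in for the Q-network so that the example needs only the standard library. The update applied is the usual Q-learning rule, moving Q(s, a) toward r + gamma * max over a' of Q(s', a').

```python
import random

# Hypothetical toy instance: (window_start, window_end, duration, profit).
TASKS = [(0, 10, 4, 5.0), (2, 9, 3, 3.0), (6, 20, 5, 8.0), (8, 16, 4, 4.0)]
TRANSITION = 1    # assumed fixed attitude-transition time between observations
HORIZON = 20.0    # used only to scale features


def timing_heuristic(clock, task):
    """Environment side (timing): earliest feasible start, or None."""
    w_start, w_end, duration, _ = task
    start = max(clock + TRANSITION, w_start)
    return start if start + duration <= w_end else None


def step(clock, remaining, action):
    """Apply the agent's choice; return (new_clock, new_remaining, reward)."""
    task = TASKS[action]
    start = timing_heuristic(clock, task)
    new_remaining = remaining - {action}
    if start is None:              # constraint check failed: task is dropped
        return clock, new_remaining, 0.0
    return start + task[2], new_remaining, task[3]


def features(clock, action):
    """Assumed state-action features: bias, profit, slack, feasibility flag."""
    _, w_end, duration, profit = TASKS[action]
    slack = (w_end - duration - clock - TRANSITION) / HORIZON
    return [1.0, profit / 10.0, slack, 1.0 if slack >= 0 else 0.0]


def q_value(weights, clock, action):
    return sum(w * f for w, f in zip(weights, features(clock, action)))


def train(episodes=500, alpha=0.05, gamma=0.95, epsilon=0.2, seed=0):
    """Agent side (sequencing): Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    weights = [0.0] * 4
    for _ in range(episodes):
        clock, remaining = 0.0, frozenset(range(len(TASKS)))
        while remaining:
            options = sorted(remaining)
            if rng.random() < epsilon:
                action = rng.choice(options)
            else:
                action = max(options, key=lambda a: q_value(weights, clock, a))
            new_clock, new_remaining, reward = step(clock, remaining, action)
            # TD target: r + gamma * max_a' Q(s', a'); no bootstrap at the end.
            target = reward
            if new_remaining:
                target += gamma * max(
                    q_value(weights, new_clock, a) for a in new_remaining
                )
            td_error = target - q_value(weights, clock, action)
            phi = features(clock, action)
            weights = [w + alpha * td_error * f for w, f in zip(weights, phi)]
            clock, remaining = new_clock, new_remaining
    return weights


def greedy_schedule(weights):
    """Use the learned value estimate to sequence tasks without exploration."""
    clock, remaining = 0.0, frozenset(range(len(TASKS)))
    order, total = [], 0.0
    while remaining:
        action = max(sorted(remaining), key=lambda a: q_value(weights, clock, a))
        clock, remaining, reward = step(clock, remaining, action)
        if reward > 0:
            order.append(action)
        total += reward
    return order, total


if __name__ == "__main__":
    learned = train()
    print(greedy_schedule(learned))
```

In this sketch, adapting to a different scheduling problem would mean changing only `timing_heuristic` (the constraint check) and the reward returned by `step`, which mirrors the customization point the abstract mentions.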


Journal: IEEE Transactions on Systems, Man, and Cybernetics: Systems
Publication Date: Sep 25, 2020
Citations: 64

