Abstract

In unmanned aerial vehicle (UAV) applications, the UAV's limited energy supply and storage have motivated the development of intelligent, energy-conserving scheduling solutions. In this paper, we investigate energy minimization for UAV-aided communication networks by jointly optimizing data-transmission scheduling and UAV hovering time. The formulated problem is combinatorial and non-convex with bilinear constraints. To tackle the problem, we first provide an optimal algorithm (OPT) and a golden-section-search-based heuristic algorithm (GSS-HEU). Both serve as offline performance benchmarks, though they may not be suitable for online operation. To this end, from a deep reinforcement learning (DRL) perspective, we propose an actor-critic-based deep stochastic online scheduling (AC-DSOS) algorithm and develop a set of approaches to confine the action space. Compared to conventional RL/DRL, the novelty of AC-DSOS lies in handling two major issues: the exponentially growing action space and infeasible actions. Numerical results show that AC-DSOS is able to provide feasible solutions and saves around 25-30% energy compared to two conventional deep AC-DRL algorithms. Compared to the developed GSS-HEU, AC-DSOS consumes around 10% more energy but reduces the computation time from the second level to the millisecond level.
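As a concrete illustration of the golden-section component in GSS-HEU, the sketch below applies golden-section search to a one-dimensional hovering-time decision. The energy function `hover_energy` is a hypothetical toy trade-off (hover power versus transmit effort), not the paper's actual energy model, and the joint user-scheduling step is omitted.

```python
# Golden-section search over a single UAV hovering-time variable.
# Illustrative sketch only: `hover_energy` is a toy model, not the
# paper's energy function.
import math

PHI = (math.sqrt(5) - 1) / 2  # inverse golden ratio, ~0.618

def golden_section_search(f, lo, hi, tol=1e-4):
    """Minimize a unimodal function f on [lo, hi]."""
    a, b = lo, hi
    x1 = b - PHI * (b - a)
    x2 = a + PHI * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:                # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - PHI * (b - a)
            f1 = f(x1)
        else:                      # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + PHI * (b - a)
            f2 = f(x2)
    return (a + b) / 2

def hover_energy(t, p_hover=100.0, demand=50.0):
    # Toy convex trade-off: hovering longer costs hover energy but lets
    # a fixed data demand be served with less transmit effort.
    return p_hover * t + demand / t

t_star = golden_section_search(hover_energy, 0.1, 10.0)
print(f"near-optimal hovering time: {t_star:.3f} s")
```

Each iteration shrinks the search interval by a constant factor of about 0.618, which is why golden-section search suits a one-dimensional hovering-time subproblem once the scheduling decisions are fixed.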

Highlights

  • Unmanned aerial vehicles (UAVs) have attracted much attention for high-speed data transmission in dynamic, distributed, or plug-and-play scenarios, e.g., disaster rescue, live concerts, or sports events [1]

  • We propose an actor-critic-based deep stochastic online scheduling (AC-DSOS) algorithm for UAV energy savings, where the original problem is transformed into a Markov decision process (MDP)

  • Unlike conventional deep reinforcement learning (DRL), we develop a set of tailored approaches in AC-DSOS, e.g., stochastic policy quantification, action space reduction, and feasibility-guaranteed reward function design, to overcome DRL's limitations in addressing combinatorial optimization problems with multiple constraints and large action spaces (a minimal sketch of the action-confinement idea follows this list)
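The sketch below illustrates the action-space confinement idea under an assumed toy feasibility rule: only users with unmet demand may be scheduled, and infeasible users are masked out of the actor's softmax before sampling. The paper's exact constraints and network architecture are not reproduced here.

```python
# Feasibility masking over a discrete scheduling action space.
# The feasibility rule (serve only users with unmet demand) is an
# illustrative assumption, not the paper's exact constraint set.
import numpy as np

def mask_infeasible(logits, remaining_demand):
    """Renormalize actor probabilities over feasible users only."""
    mask = (remaining_demand > 0).astype(float)
    scores = np.exp(logits - logits.max()) * mask  # masked softmax
    return scores / scores.sum()                   # assumes >= 1 feasible user

rng = np.random.default_rng(0)
logits = rng.normal(size=4)                 # stub for the actor network output
remaining = np.array([0.0, 3.2, 0.0, 1.5])  # Mbits still owed to each user
probs = mask_infeasible(logits, remaining)
action = rng.choice(len(probs), p=probs)    # only feasible users can be drawn
print(probs.round(3), "-> scheduled user", action)
```

Masking shrinks the effective action space at every step and guarantees that sampled actions respect the (toy) feasibility rule, the same motivation behind AC-DSOS's action space reduction.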


Summary

INTRODUCTION

Deterministic optimization algorithms, e.g., [2]–[9], might not be suitable for fast decision making in a dynamic wireless environment. To address this issue, deep learning-based solutions have been investigated in the literature. In [19], the authors employed a deep actor-critic method to design a learning algorithm for UAV-aided systems, considering energy efficiency and user fairness.

Compared to offline optimization approaches, we provide online learning and timely energy-saving solutions based on DRL. We propose an actor-critic-based deep stochastic online scheduling (AC-DSOS) algorithm for UAV energy savings, where the original problem is transformed into a Markov decision process (MDP). Unlike conventional DRL, we develop a set of tailored approaches in AC-DSOS, e.g., stochastic policy quantification, action space reduction, and feasibility-guaranteed reward function design, to overcome DRL's limitations in addressing combinatorial optimization problems with multiple constraints and large action spaces. The code for generating the results is available online at: https://github.com/ArthuretYuan
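To make the actor-critic training loop concrete, below is a minimal tabular sketch in which the critic's temporal-difference (TD) error drives the actor's policy-gradient update, and infeasible actions are discouraged through a reward penalty. The environment dynamics, feasibility rule, and reward scale are illustrative assumptions, not the paper's MDP.

```python
# Minimal tabular actor-critic loop with a feasibility-penalized reward.
# All MDP details here (transitions, rewards, feasibility rule) are toy
# assumptions for illustration, not the paper's model.
import numpy as np

rng = np.random.default_rng(1)
n_states, n_actions = 5, 3
theta = np.zeros((n_states, n_actions))  # actor: softmax policy parameters
v = np.zeros(n_states)                   # critic: state-value estimates
alpha_a, alpha_c, gamma = 0.05, 0.1, 0.95

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def step(state, action):
    """Hypothetical environment: random next state, energy-style cost."""
    next_state = rng.integers(n_states)
    feasible = not (state == 0 and action == 2)          # toy feasibility rule
    reward = -1.0 * (action + 1) if feasible else -50.0  # penalty if infeasible
    return next_state, reward

s = rng.integers(n_states)
for _ in range(2000):
    probs = softmax(theta[s])
    a = rng.choice(n_actions, p=probs)
    s_next, r = step(s, a)
    td_error = r + gamma * v[s_next] - v[s]    # critic evaluates the transition
    v[s] += alpha_c * td_error                 # critic update
    grad_log = -probs
    grad_log[a] += 1.0                         # grad of log softmax policy
    theta[s] += alpha_a * td_error * grad_log  # actor update
    s = s_next

print("learned policy at state 0:", softmax(theta[0]).round(3))
```

After training, the policy at state 0 shifts probability mass away from the heavily penalized action, mirroring how a feasibility-guaranteed reward steers AC-DSOS away from infeasible schedules.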

System Model
UAV’s Energy Model
PROBLEM FORMULATION
User-Timeslot Scheduling
Hovering Time Allocation
Algorithm Summary
Problem Reformulation
The AC-DSOS algorithm
NUMERICAL RESULTS
Parameter Settings
Results and Analysis
CONCLUSION