Abstract

Advances in edge computing significantly impact the development of mobile networks. As one of the most important research goals for edge networks, resource orchestration has been well studied in recent years; however, existing approaches based on deep reinforcement learning share a common bottleneck: training inefficiency. In this paper, we treat drones, whose available time is significantly limited by their batteries, as the mobile terminals of a target edge network and aim to maximize energy efficiency. The battery-constrained resource orchestration problem is formulated as a nonconvex optimization problem that accounts for both operating costs and available battery capacity. Since the resulting mixed-integer program is NP-hard, the Auxiliary-Task-based dynamic Weighting Resource Orchestration (ATWRO) algorithm is proposed. To improve sample efficiency, related parameters serving as auxiliary tasks are employed to provide additional gradient information. We further refine the exploration space and apply an alternative replay buffer to develop a customized reinforcement learning approach. Extensive experiments demonstrate the effectiveness of the proposed scheme, showing that auxiliary tasks enable reinforcement learning agents to be trained more efficiently. Moreover, the service time of the whole system can be prolonged, and a higher number of completed tasks can be guaranteed.
