Abstract

Technological advances in the unmanned aerial vehicle (UAV) industry have endowed UAVs with greater computing and storage resources, leading to the vision of UAV-assisted edge computing, in which computing missions can be offloaded from a cellular network to a UAV cloudlet. In this paper, we propose a UAV-assisted computation offloading paradigm in which a group of UAVs fly around while providing value-added edge computing services. Complex computing missions are decomposed into typical task flows with inter-dependencies. Taking into consideration the inter-dependencies of the tasks, the dynamic network states, and the energy constraints of the UAVs, we formulate the average mission response time minimization problem and model it as a Markov decision process. Specifically, each time a mission arrives or a task execution finishes, we must decide the target helper for the next task execution and the fraction of bandwidth allocated to the communication. To separate the evaluation of this integrated decision, we propose multi-agent reinforcement learning (MARL) algorithms in which the target helper and the bandwidth allocation are determined by two agents. We design a respective advantage evaluation function for each agent to address the multi-agent credit assignment challenge, and further extend the on-policy algorithm to an off-policy variant. Simulation results show that the proposed MARL-based approaches have desirable convergence properties and can adapt to the dynamic environment, significantly reducing the average mission response time compared with benchmark approaches.
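
The following is a minimal sketch, not taken from the paper, of the two-agent decision structure the abstract describes: on each event (a mission arrival or a task completion), one agent selects the target helper UAV and a second agent selects the bandwidth fraction, with each agent keeping its own critic so that a separate advantage signal can be computed per agent. The state dimension, number of UAVs, network sizes, and the discretization of bandwidth fractions are all illustrative assumptions.

```python
# Hypothetical sketch of the two-agent (helper + bandwidth) decision structure.
# All dimensions and the bandwidth discretization are assumptions, not the
# paper's actual design.
import torch
import torch.nn as nn

STATE_DIM = 16   # assumed: encodes task dependencies, queues, energy, channels
NUM_UAVS = 4     # assumed: candidate helper UAVs in the cloudlet
BW_LEVELS = 10   # assumed: discretized bandwidth fractions 0.1, 0.2, ..., 1.0

class Agent(nn.Module):
    """One actor-critic pair per decision variable; separate critics allow a
    per-agent advantage estimate (the credit-assignment idea in the abstract)."""
    def __init__(self, num_actions):
        super().__init__()
        self.actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                   nn.Linear(64, num_actions))
        self.critic = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                    nn.Linear(64, 1))

    def act(self, state):
        dist = torch.distributions.Categorical(logits=self.actor(state))
        action = dist.sample()
        return action, dist.log_prob(action), self.critic(state).squeeze(-1)

helper_agent = Agent(NUM_UAVS)      # decides which UAV executes the next task
bandwidth_agent = Agent(BW_LEVELS)  # decides the bandwidth fraction

state = torch.randn(STATE_DIM)      # placeholder event-time system state
helper, logp_h, value_h = helper_agent.act(state)
bw_level, logp_b, value_b = bandwidth_agent.act(state)
print(f"offload next task to UAV {helper.item()}, "
      f"bandwidth fraction {(bw_level.item() + 1) / BW_LEVELS:.1f}")
```

In an on-policy setting, the log-probabilities and per-agent values above would feed separate advantage-weighted policy-gradient updates; the off-policy extension mentioned in the abstract would additionally reweight stored transitions.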
