This paper investigates a hierarchical aerial computing system in which both high-altitude platforms (HAPs) and unmanned aerial vehicles (UAVs) provide computation services to ground devices (GDs). Unlike existing works, which ignore UAV task offloading to HAPs and suffer from long transmission delays between HAPs and GDs, our system makes UAVs responsible for collecting the tasks generated by GDs. Because their resources and coverage are limited, UAVs must cooperatively allocate their resources (spectrum, caching, and computing) to GDs. After collecting GD tasks, UAVs may offload part of these tasks to the HAP to minimize task processing delay and thus better satisfy GD delay requirements. Our objective is to maximize the amount of computed tasks while satisfying the tasks' heterogeneous quality-of-service (QoS) requirements through the joint optimization of UAV resource allocation and task offloading. To this end, the joint optimization problem is first formulated as a partially observable Markov decision process (POMDP) under constraints on available resources, UAV energy, and collision avoidance. We then design a multi-agent proximal policy optimization (MAPPO)-based algorithm to solve it. By adopting the centralized training with decentralized execution framework, UAVs acting as agents can cooperatively make decisions on GD association, resource allocation, and task offloading based on their local observations. In addition, state normalization and action masking are adopted to improve training efficiency. Experimental results verify the efficiency of the proposed algorithm, and the system performance is analyzed through numerical results.
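To illustrate the two training aids named in the abstract, the following is a minimal sketch of how state normalization and action masking are commonly realized in policy-gradient agents. It is not the paper's implementation: the names `RunningNormalizer` and `masked_softmax`, the use of Welford's running statistics, and the example mask (a GD outside a UAV's coverage) are all assumptions made for illustration.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over action logits in which invalid actions get probability 0.

    `mask[i]` is True when action i is allowed (hypothetical example: the
    GD indexed by i lies inside this UAV's coverage area).
    """
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


class RunningNormalizer:
    """Per-feature running mean and variance (Welford's online algorithm)."""

    def __init__(self, dim, eps=1e-8):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations
        self.eps = eps

    def update(self, x):
        """Fold one observation vector into the running statistics."""
        self.n += 1
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (xi - self.mean[i])

    def normalize(self, x):
        """Return x scaled to roughly zero mean and unit variance."""
        if self.n == 0:
            return list(x)  # no statistics yet, so pass through unchanged
        out = []
        for i, xi in enumerate(x):
            std = math.sqrt(self.m2[i] / self.n)
            out.append((xi - self.mean[i]) / (std + self.eps))
        return out


# Usage: normalize a local observation, then sample only from valid actions.
normalizer = RunningNormalizer(dim=2)
normalizer.update([0.0, 10.0])
normalizer.update([2.0, 30.0])
obs = normalizer.normalize([2.0, 30.0])
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
```

Masking before the softmax, rather than penalizing invalid actions through the reward, keeps the agent from ever sampling an infeasible action, which is one plausible reason such a mask would improve training efficiency.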