Conventional approaches to network control and resource allocation, which allocate dedicated spectrum resources and separate infrastructure to massive Internet of Things (IoT) networks, are cost-inefficient. The space-terrestrial integrated network (STIN), one of the enabling technologies and architectures for future 6G wireless networks, can address the network control and resource allocation problems of massive IoT. In this paper, a novel STIN-based network control and resource allocation problem is formulated for massive IoT and solved with state-of-the-art hierarchical deep actor-critic networks (H-DAC). The massive IoT networks are spread over an urban vicinity where cooperation is possible; this is leveraged to negotiate a joint policy on the price per unit spectrum that the IoT networks are willing to pay. Deep actor-critic reinforcement learning (RL) is used to solve the joint network control and resource allocation problem, which is modeled as a utility maximization problem. The RL-based algorithms determine the cost per unit spectrum for the federated cloud of IoT networks and the data rate assigned to each IoT network and its IoT devices. The algorithm also decides whether to transmit through the space network or the terrestrial network. We validate the performance of the proposed H-DAC scheme by comparing it with variants of actor-critic RL. We show that, through proper system state formulation and reward design, the proposed H-DAC scheme outperforms the reference schemes across different network parameters and metrics.
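To make the hierarchical decision structure concrete, the sketch below illustrates (not the authors' implementation, and with hypothetical state/action dimensions) how a two-level actor-critic agent could couple a high-level spectrum-pricing decision with low-level rate allocation and space/terrestrial link selection:

```python
# Minimal sketch of a two-level actor-critic structure, assuming hypothetical
# state/action dimensions: the high-level actor prices spectrum for the
# federated IoT cloud, and the low-level actor allocates rate and chooses
# space vs. terrestrial links. This is an illustrative assumption, not the
# paper's H-DAC algorithm.
import torch
import torch.nn as nn


class Actor(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Sigmoid()  # actions scaled to [0, 1]
        )

    def forward(self, state):
        return self.net(state)


class Critic(nn.Module):
    def __init__(self, state_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 1)  # scalar state-value estimate
        )

    def forward(self, state):
        return self.net(state)


# Hypothetical dimensions: aggregated channel/queue state of N IoT networks.
N_NETWORKS, STATE_DIM = 4, 16
high_actor = Actor(STATE_DIM, 1)                   # price per unit spectrum
low_actor = Actor(STATE_DIM + 1, 2 * N_NETWORKS)   # per-network rate + link choice
critic = Critic(STATE_DIM)

state = torch.randn(1, STATE_DIM)
price = high_actor(state)                          # high-level pricing decision
low_out = low_actor(torch.cat([state, price], dim=-1))
rates, link_probs = low_out[:, :N_NETWORKS], low_out[:, N_NETWORKS:]
use_space = link_probs > 0.5                       # True -> route via space segment
value = critic(state)                              # critic value for advantage estimation
```

In this hypothetical layout, the critic's value estimate would drive policy-gradient updates of both actors, with the low-level actor conditioned on the high-level pricing action.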