In next-generation Internet of Things (NG-IoT) networks, large volumes of data are aggregated from user devices and sensor nodes at local computing units for further processing to support high-level applications. These transmission demands raise new challenges for current link scheduling protocols. Centralized link scheduling protocols are unsuitable for some large-scale NG-IoT scenarios. Previous distributed link scheduling protocols rely on randomized transmission schemes to avoid interference, which makes it hard to fully utilize bandwidth resources. The multi-agent machine learning (MAML) technique is a potential approach to finding the optimal link scheduling strategy. However, because the state space is very large, the agents take a long time to approach the optimal solution, which limits the practicality of MAML. To fully utilize bandwidth resources and improve the efficiency of link scheduling, this paper studies a multi-agent reinforcement learning enabled link scheduling problem. Unlike conventional MAML techniques, which begin exploration from a randomly selected state, our multi-agent reinforcement learning algorithm first obtains a good state within a polynomial number of time steps by executing a distributed and randomized sub-algorithm. We say a state is good if it is not far from the optimal state. Our multi-agent reinforcement learning scheme then starts from this good state and explores with an ε-greedy scheme, which significantly reduces the number of time steps needed to get close to the optimal link scheduling strategy. Extensive simulations are conducted to evaluate the performance of the proposed scheme.
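The two-phase idea in the abstract, a warm start followed by ε-greedy exploration, can be illustrated with a minimal Python sketch. It is not the paper's algorithm. The abstract does not specify the sub-algorithm that produces the good state, so `random_maximal_schedule` is a hypothetical stand-in that builds a conflict-free maximal set of links in a centralized simulation. The `conflicts` graph, the value list passed to `epsilon_greedy`, and all function names are likewise assumptions made for illustration.

```python
import random


def random_maximal_schedule(conflicts, rng):
    """Hypothetical stand-in for the warm-start step (not from the paper).

    Visits links in random order and activates each link that does not
    interfere with any link already activated.
    """
    order = list(conflicts)
    rng.shuffle(order)
    active = set()
    for link in order:
        if not (conflicts[link] & active):
            active.add(link)
    return active


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Toy conflict graph (assumed): links 0-1-2-3 in a line, neighbours interfere.
conflicts = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
rng = random.Random(0)

# Phase 1: start from a conflict-free schedule instead of a random state.
start = random_maximal_schedule(conflicts, rng)

# Phase 2: explore from that state; the values here are illustrative only.
action = epsilon_greedy([0.1, 0.7, 0.2], epsilon=0.1, rng=rng)
```

In this sketch, a small `epsilon` keeps an agent mostly on its current best action while still occasionally trying alternatives, so exploration is concentrated around the warm-start state.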