Abstract

This paper proposes an intelligent inter-cell interference coordination (ICIC) scheme for autonomous heterogeneous networks (HetNets), in which small base stations (SBSs) agilely schedule sub-channels to individual users at each Transmission Time Interval (TTI), sensing the environment with the aim of mitigating interference and maximizing long-term throughput. Since only local network states, including the Signal to Interference plus Noise Ratio (SINR), can be observed in autonomous HetNets, the decision-making process of interference coordination at the SBSs is modeled as a non-cooperative partially observable Markov decision process (POMDP) game, with the aim of achieving a Nash equilibrium. Because the reward function is inexplicit and only a few samples are available for prior training, we formulate the ICIC problem as a distributed inverse reinforcement learning (IRL) problem following the POMDP game. Furthermore, we propose a non-prior-knowledge-based self-imitating learning (SIL) algorithm that incorporates Wasserstein Generative Adversarial Networks (WGANs) and the Double Deep Q Network (Double DQN) algorithm to perform behavior imitation and few-shot learning, solving the IRL problem from both the policy and the value perspectives. To cater for the plug-and-play operation mode of indoor SBSs, the Double DQN is initialized according to the SINR, and a nested training scheme is adopted to overcome the slow-start problem of the learning process. Numerical results reveal that SIL is able to perform TTI-level decision-making to solve the ICIC problem, and that the overall network throughput under SIL can be improved by up to 19.8% compared with known benchmark algorithms.
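The Double DQN component named in the abstract can be illustrated with a minimal sketch of its target computation: the online network selects the greedy action while the target network evaluates it, which reduces the overestimation bias of standard Q-learning. This is a generic illustration of the Double DQN update rule, not the paper's SIL implementation; the toy Q-tables and values below are purely illustrative assumptions.

```python
import numpy as np

def double_dqn_target(q_online, q_target, reward, next_state, gamma=0.99):
    """Double DQN target: the online network picks the action,
    the target network evaluates it (decoupled selection/evaluation)."""
    best_action = int(np.argmax(q_online[next_state]))        # action selection (online net)
    return reward + gamma * q_target[next_state, best_action]  # action evaluation (target net)

# Toy Q-tables standing in for the two networks (illustrative values only).
q_online = np.array([[1.0, 2.0],
                     [0.5, 3.0]])
q_target = np.array([[0.8, 1.5],
                     [0.4, 2.5]])

# Online net prefers action 1 in state 1; target net scores it 2.5:
# y = 1.0 + 0.9 * 2.5 = 3.25
y = double_dqn_target(q_online, q_target, reward=1.0, next_state=1, gamma=0.9)
print(round(y, 2))
```

In the full algorithm, `y` would serve as the regression target for the online network's Q-value at the taken action, with the target network's weights periodically synchronized from the online network.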
