Mobile operators have recently been shifting toward an intelligent autonomous network paradigm, in which mobile networks are automated in a plug-and-play manner to reduce manual intervention. Under these circumstances, serious inter-cell interference becomes inevitable and may severely degrade system throughput and users’ quality of service (QoS), especially in dense residential small base station (SBS) deployments. This paper proposes an intelligent inter-cell interference coordination (ICIC) scheme for autonomous heterogeneous networks (HetNets), in which the SBSs sense the environment and agilely schedule sub-channels to individual users at each Transmission Time Interval (TTI), with the aim of mitigating interference and maximizing long-term throughput. Because the reward function is not explicit and only a few samples are available for pre-training, we formulate the ICIC problem as a distributed inverse reinforcement learning (IRL) problem following the partially observable Markov decision process (POMDP) game framework. We propose a self-imitating learning (SIL) algorithm that requires no prior knowledge and combines Wasserstein Generative Adversarial Networks (WGANs) with the Double Deep Q Network (Double DQN) algorithm to perform behavior imitation and few-shot learning, solving the IRL problem from both the policy and the value perspectives. Numerical results show that SIL can make decisions at the TTI level to solve the ICIC problem, and that it improves overall network throughput by up to 19.8% compared with known benchmark algorithms.