As a key 5G technology, massive multiple-input multiple-output (MIMO) can effectively improve system capacity and reduce latency. This paper proposes a user scheduling and spectrum allocation method based on the combinatorial multi-armed bandit (CMAB) framework for a massive MIMO system. Compared with traditional methods, the proposed CMAB-based method avoids channel estimation for all users, significantly reduces pilot overhead, and improves spectral efficiency. The method has two stages. In the first stage, we transform the user scheduling problem into a CMAB problem in which each user is treated as a base arm and the channel energy is taken as the reward. A linear upper confidence bound (UCB) arm selection algorithm is proposed, and the regret of the resulting user scheduling algorithm is proved to grow logarithmically over time. In the second stage, users are grouped according to their statistical channel state information (CSI), such that the statistical CSI of users in different groups is approximately orthogonal in the angular domain. One user is then selected from each group and the selected users are allocated to the same subcarrier, so that the channels of the users on each subcarrier are approximately orthogonal, which reduces inter-user interference and improves spectral efficiency. Simulation results validate that the proposed method achieves high spectral efficiency.
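The first-stage idea (users as base arms, channel energy as reward, selection by an upper confidence bound) can be illustrated with a generic combinatorial UCB loop. The following is a minimal sketch and not the paper's algorithm: the function name `cucb_user_scheduling`, the exploration constant 1.5, the Gaussian reward noise, and the normalization of channel energies to [0, 1] are all assumptions made for illustration.

```python
import numpy as np

def cucb_user_scheduling(mean_energy, num_select, num_rounds, seed=0):
    """Generic combinatorial UCB sketch (hypothetical helper, not the paper's method).

    mean_energy: assumed true mean channel energy per user, normalized to [0, 1].
    num_select:  number of users scheduled per round (size of the super arm).
    """
    rng = np.random.default_rng(seed)
    mean_energy = np.asarray(mean_energy, dtype=float)
    num_users = mean_energy.size
    counts = np.zeros(num_users)      # times each user has been scheduled
    estimates = np.zeros(num_users)   # running mean of observed energy

    for t in range(1, num_rounds + 1):
        # Confidence bonus; users never scheduled get an infinite index
        # so that every user is tried at least once.
        with np.errstate(divide="ignore", invalid="ignore"):
            bonus = np.sqrt(1.5 * np.log(t) / counts)
        index = np.where(counts > 0, estimates + bonus, np.inf)

        # Linear reward: the best super arm is the top-K users by index.
        chosen = np.argsort(-index)[:num_select]

        # Only scheduled users are observed (assumed noisy energy feedback).
        noise = 0.1 * rng.standard_normal(num_select)
        rewards = np.clip(mean_energy[chosen] + noise, 0.0, 1.0)

        counts[chosen] += 1
        estimates[chosen] += (rewards - estimates[chosen]) / counts[chosen]

    return counts, estimates

# Example run with six hypothetical users, scheduling two per round.
counts, estimates = cucb_user_scheduling([0.9, 0.8, 0.2, 0.1, 0.3, 0.15], 2, 2000)
```

Because only the scheduled users are observed in each round, the sketch mirrors the abstract's point that channel estimation for all users can be avoided; the second-stage grouping and subcarrier allocation are not modeled here.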