Abstract

A novel non-orthogonal multiple access (NOMA) enabled cache-aided mobile edge computing (MEC) framework is proposed for minimizing the sum energy consumption. The NOMA strategy enables mobile users to offload computation tasks to the access point (AP) simultaneously, which improves spectral efficiency. In this article, the considered resource allocation problem is formulated as a long-term reward maximization problem that involves a joint optimization of the task offloading decision, computation resource allocation, and caching decision. To tackle this nontrivial problem, a single-agent Q-learning (SAQ-learning) algorithm is invoked to learn a long-term resource allocation strategy from historical experience. Moreover, a Bayesian learning automata (BLA) based multi-agent Q-learning (MAQ-learning) algorithm is proposed for task offloading decisions. More specifically, a BLA based action selection scheme is proposed for the agents in MAQ-learning to select the optimal action in every state. The proposed BLA based action selection scheme is instantaneously self-correcting; consequently, whenever the reward probabilities of the two computing modes (i.e., local computing and offloading) are unequal, the optimal action is eventually identified. Extensive simulations demonstrate that: 1) the proposed cache-aided NOMA MEC framework significantly outperforms representative benchmark schemes under various network setups; and 2) the effectiveness of the proposed BLA-MAQ-learning algorithm is confirmed by comparison with conventional reinforcement learning algorithms.
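The BLA-based selection step described above can be illustrated with a minimal sketch. The automaton below maintains a Beta posterior over each action's (unknown) reward probability and, at each step, plays the action whose posterior sample is larger, so the better action dominates as evidence accumulates. This is an illustrative two-action Bernoulli setting only; the environment reward probabilities and the class/function names are hypothetical, not taken from the paper.

```python
import random

class BayesianLearningAutomaton:
    """Two-action BLA: a Beta(alpha, beta) posterior per action;
    the action with the larger posterior sample is selected."""

    def __init__(self, n_actions=2):
        self.alpha = [1.0] * n_actions  # prior successes + 1
        self.beta = [1.0] * n_actions   # prior failures + 1

    def select_action(self):
        # Sample each action's reward probability from its posterior
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, action, reward):
        # Bernoulli feedback: 1 = success, 0 = failure
        if reward:
            self.alpha[action] += 1
        else:
            self.beta[action] += 1

# Hypothetical environment: action 1 (say, offloading) succeeds
# more often than action 0 (local computing).
def env_reward(action):
    success_prob = [0.3, 0.7][action]
    return 1 if random.random() < success_prob else 0

random.seed(0)
bla = BayesianLearningAutomaton()
counts = [0, 0]
for _ in range(2000):
    a = bla.select_action()
    counts[a] += 1
    bla.update(a, env_reward(a))

# The self-correcting property: the better action is played
# far more often once its posterior separates from the other's.
print(counts[1] > counts[0])
```

In the MAQ-learning setting of the paper, each agent would use such an automaton per state to arbitrate between the two computing modes instead of an epsilon-greedy rule.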
