Unmanned Aerial Vehicles (UAVs) have become one of the most significant components of future wireless networks owing to their on-demand and cost-effective deployment. Meanwhile, Device-to-Device (D2D) communication, which improves network performance by providing proximity-based services, has emerged as an important feature of fifth-generation (5G) cellular networks. However, the explosive growth of smart devices and bandwidth-hungry applications causes enormous energy consumption. Energy Harvesting (EH) has shown great promise as a potential solution for improving energy efficiency. Consequently, in this paper, we investigate the resource allocation problem in UAV-aided EH-powered D2D Cellular Networks (UAV-EH-DCNs). Our objective is to maximize the energy efficiency while guaranteeing the satisfaction of ground users (GUs). Owing to the non-convexity of the problem, we formulate it as a Markov decision process. A Deep Deterministic Policy Gradient (DDPG) algorithm is then applied to find the optimal strategy. Additionally, a Long Short-Term Memory (LSTM) network is employed to accelerate convergence by extracting historical information on GU satisfaction to determine the current resource allocation strategy. Numerical results demonstrate the validity of the proposed DDPG+LSTM algorithm compared with the DDPG and deep Q-learning algorithms.