Abstract

In multi-agent device-to-device (D2D) communication networks, the scene surrounding each agent changes as the agents move. To address the communication interference and excessive energy consumption caused by poor adaptability to changing scenes, a power allocation algorithm based on scene-adaptive cooperative Q-learning (SACL) is proposed in this paper. Specifically, a scene variable is added to the state space, and the reward function is redesigned so that a larger system capacity is achieved with less power. Then, to improve the convergence speed of the SACL algorithm, a balance factor based on the location distribution of the agents is introduced, yielding a fast scene-adaptive cooperative learning (FSACL) algorithm. Simulation experiments verify the adaptability of the SACL and FSACL algorithms when the scene changes. Compared with the traditional cooperative Q-learning (CL) and independent Q-learning (IL) algorithms, SACL and FSACL obtain a larger system capacity with lower power. In addition, the FSACL algorithm converges faster than the CL, IL, and SACL algorithms.
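To make the abstract's key ideas concrete, the sketch below shows what a scene-aware Q-learning loop for power allocation could look like: the scene index is part of the state, and the reward trades capacity against transmit power. This is a minimal illustration under stated assumptions, not the paper's actual formulation; the number of scenes, power levels, reward weight, and toy channel dynamics are all invented for the example.

```python
import numpy as np

# Minimal sketch of scene-aware Q-learning for D2D power allocation.
# All constants (num_scenes, power_levels, beta, ...) are illustrative
# assumptions, not values taken from the paper.

rng = np.random.default_rng(0)

num_scenes = 3          # assumed number of distinct mobility scenes
num_channel_states = 4  # assumed quantised channel-quality levels
power_levels = np.array([0.1, 0.5, 1.0, 2.0])  # candidate transmit powers (W), assumed

alpha, gamma, epsilon = 0.1, 0.9, 0.1  # typical Q-learning hyperparameters
beta = 0.5                             # assumed weight penalising transmit power

# State = (scene, channel state); action = index of a power level.
Q = np.zeros((num_scenes, num_channel_states, len(power_levels)))

def reward(capacity, power):
    """Assumed reward: favour capacity while penalising transmit power."""
    return capacity - beta * power

def step(scene, ch, p):
    """Toy environment: capacity grows with power and channel quality (illustrative only)."""
    capacity = np.log2(1.0 + (ch + 1) * p / 0.5)
    next_ch = rng.integers(num_channel_states)
    # occasional scene change models agent mobility
    next_scene = scene if rng.random() > 0.05 else rng.integers(num_scenes)
    return capacity, next_scene, next_ch

scene, ch = 0, 0
for _ in range(10_000):
    # epsilon-greedy selection over power levels
    if rng.random() < epsilon:
        a = rng.integers(len(power_levels))
    else:
        a = int(np.argmax(Q[scene, ch]))
    cap, next_scene, next_ch = step(scene, ch, power_levels[a])
    r = reward(cap, power_levels[a])
    # standard Q-learning update; the scene index keeps per-scene policies separate
    Q[scene, ch, a] += alpha * (r + gamma * Q[next_scene, next_ch].max() - Q[scene, ch, a])
    scene, ch = next_scene, next_ch
```

In this toy setup, including the scene index in the state is what lets the learned policy differ across scenes; the cooperative sharing of Q-values and the balance factor described in the abstract are not modelled here.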
