Abstract

Machine-to-machine (M2M) communication enables automated data transfer and measurement between multiple devices with zero or minimal human intervention. M2M communication offers many benefits and opportunities, including the ability to handle diverse data types and large data volumes, the capacity for devices to learn on their own, and better decision making. Despite these advantages, M2M faces major challenges such as communication delay, data acquisition mismatch, the need for additional resources, and high susceptibility to errors. To address these challenges, in this work we discuss various state-of-the-art deep reinforcement learning (DRL) algorithms. We investigate the deep Q-network (DQN), dueling DQN, multi-step DQN, actor-critic (AC), advantage AC, REINFORCE, trust region policy optimization (TRPO), and proximal policy optimization (PPO) algorithms.
