Abstract

This two-part series of papers surveys recent advances in Deep Reinforcement Learning (DRL) for solving partially observable Markov decision process (POMDP) problems. Reinforcement Learning (RL) is an approach that simulates the natural human learning process; its key idea is to let the agent learn by interacting with a stochastic environment. The agent's limited access to information about the environment is what allows AI to be applied efficiently in most fields that require self-learning. An organized investigation is essential so that we can make good comparisons and choose the best structures or algorithms when applying DRL in various applications. The first part of the overview introduces Markov Decision Process (MDP) problems, Reinforcement Learning, and applications of DRL for solving POMDP problems in games, robotics, and natural language processing. In part two, we continue with applications in transportation, industry, communications and networking, and other areas, and discuss the limitations of DRL.

Highlights

  • Reinforcement Learning (RL) is an approach that simulates the natural human learning process; its key idea is to let the agent learn by interacting with a stochastic environment

  • Navigation is a fundamental task in autonomous driving, and Deep Reinforcement Learning (DRL) has proven effective in navigation problems: Fayjie et al. [33] presented a Deep Q-Network (DQN)-based approach for navigation in urban environments, and Isele et al. [34] used a DQN-based method for navigating occluded intersections

  • Mobile Edge Computing (MEC) is a promising technology for extending services to the edge of Internet of Things (IoT) systems, and DRL has been successfully applied in MEC networks in recent years [74,75,76]

Summary

Transportation

An intelligent transportation system (ITS) [1] is an application that aims to provide safe, efficient, and innovative services for transport and traffic management and to construct more intelligent transport networks. The first format for representing the traffic state is an image-like representation called Discrete Traffic State Encoding (DTSE). It captures high-resolution, practical information from the intersection. Genders and Razavi [14] proposed the discrete traffic state encoding, which is information-dense, as the input to the DQN for a traffic signal control agent (DQTSCA), and in [15] evaluated state representations from low to high resolution using Asynchronous Advantage Actor-Critic (A3C). Xu et al. [24] used a data-driven approach to find critical nodes, which can cause a reduction in traffic efficiency, and introduced a policy gradient method on these nodes. In 2020, Haydari and Yilmaz [2] provided tables outlining single- and multi-agent RL approaches for Traffic Signal Control (TSC), DRL methods for TSC, and DRL solutions for other ITS applications.
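
To make the idea of an image-like, cell-based traffic state more concrete, the following is a minimal sketch in the spirit of DTSE, not the exact encoding used in [14]. It assumes each incoming lane is split into fixed-length cells and that each cell records whether a vehicle is present and its speed normalized by the speed limit. The function name `encode_lane`, the cell length, and the two-channel layout are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def encode_lane(
    vehicles: List[Tuple[float, float]],
    lane_length_m: float,
    cell_length_m: float,
    speed_limit_mps: float,
) -> Tuple[List[int], List[float]]:
    """Discretize one lane into cells (illustrative, hypothetical helper).

    vehicles: (distance_to_stop_line_m, speed_mps) pairs.
    Returns (presence, speed): presence[i] is 1 if a vehicle is in cell i,
    and speed[i] is that vehicle's speed divided by the speed limit,
    capped at 1.0. Cell 0 is the cell closest to the stop line.
    """
    n_cells = int(lane_length_m // cell_length_m)
    presence = [0] * n_cells
    speed = [0.0] * n_cells
    for position_m, speed_mps in vehicles:
        # Ignore vehicles outside the encoded stretch of road.
        if not 0 <= position_m < n_cells * cell_length_m:
            continue
        cell = int(position_m // cell_length_m)
        presence[cell] = 1
        # If several vehicles share a cell, the last one listed wins.
        speed[cell] = min(speed_mps / speed_limit_mps, 1.0)
    return presence, speed


# Example: a 20 m lane with 5 m cells and a 10 m/s speed limit.
presence, speed = encode_lane(
    [(3.0, 5.0), (12.5, 10.0), (99.0, 4.0)],
    lane_length_m=20.0,
    cell_length_m=5.0,
    speed_limit_mps=10.0,
)
```

Stacking such per-lane vectors for every approach of an intersection would give a grid that a convolutional DQN could consume; smaller cells give a higher-resolution state at the cost of a larger input.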

Autonomous Driving
Other Applications in ITS
Industrial Applications
Smart Grid
Communications and Networking
Connected Vehicles
Resources Management
Healthcare
Education
Finance
Aerospace
Deep Reinforcement Learning Limitations
Summary