Abstract
This two-part series of papers surveys recent advances in Deep Reinforcement Learning (DRL) for solving partially observable Markov decision process (POMDP) problems. Reinforcement Learning (RL) is an approach that simulates the natural human learning process: the agent learns by interacting with a stochastic environment. The fact that the agent has only limited access to information about the environment allows AI to be applied efficiently in most fields that require self-learning. An organized investigation is essential so that we can make sound comparisons and choose the best structures or algorithms when applying DRL to various applications. The first part of the overview introduces Markov decision process (MDP) problems, Reinforcement Learning, and applications of DRL for solving POMDP problems in games, robotics, and natural language processing. In part two, we continue with applications in transportation, industry, communications and networking, etc., and discuss the limitations of DRL.
Highlights
Reinforcement Learning (RL) is an approach that simulates the natural human learning process: the agent learns by interacting with a stochastic environment
Navigation is a fundamental task in autonomous driving, and Deep Reinforcement Learning (DRL) has proven effective for navigation problems: Fayjie et al. [33] presented a Deep Q-Network (DQN)-based approach for navigation in urban environments, and Isele et al. [34] used a DQN-based method for navigating occluded intersections
Mobile Edge Computing (MEC) is a promising technology to extend the services to the edge of the Internet of Things (IoT) system, and DRL has been successfully applied in the MEC networks in recent years [74,75,76]
Summary
An intelligent transportation system (ITS) [1] is an application that aims to provide safe, efficient, and innovative services for transport and traffic management and to construct more intelligent transport networks. One state representation format is an image-like encoding called Discrete Traffic State Encoding (DTSE), which acquires high-resolution, practical information from the intersection. Genders and Razavi [14] proposed this discrete traffic state encoding, which is information-dense, as the input to the DQN networks for a traffic signal control agent (DQTSCA), and evaluated state representations from low to high resolution using Asynchronous Advantage Actor-Critic (A3C) in [15]. Xu et al. [24] used a data-driven approach to find critical nodes that can cause a reduction in traffic efficiency, and introduced a policy gradient method on these nodes. In 2020, Haydari and Yilmaz [2] provided tables outlining single- and multi-agent RL approaches for Traffic Signal Control (TSC), DRL methods for TSC, and DRL solutions for other ITS applications
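To make the image-like DTSE representation concrete, the sketch below encodes one approach lane of an intersection into a presence channel and a normalized-speed channel, as described above. This is a minimal illustration, not the authors' implementation: the lane length, cell size, and speed limit are assumed parameters, and a real system would stack such channels across all approach lanes and append the current signal phase before feeding the tensor to a DQN.

```python
import numpy as np

def encode_dtse(vehicles, lane_length=100.0, cell_size=5.0, speed_limit=13.9):
    """Encode one approach lane as an image-like DTSE tensor (sketch).

    vehicles: list of (position_m, speed_mps) tuples along the lane.
    Returns an array of shape (2, n_cells): channel 0 is binary vehicle
    presence per cell, channel 1 is speed normalized by the speed limit.
    """
    n_cells = int(lane_length / cell_size)
    presence = np.zeros(n_cells)
    speed = np.zeros(n_cells)
    for pos, spd in vehicles:
        # Map the vehicle's position to a discrete cell index.
        idx = min(int(pos // cell_size), n_cells - 1)
        presence[idx] = 1.0
        speed[idx] = spd / speed_limit
    return np.stack([presence, speed])

# Example: two vehicles on a 100 m lane divided into 5 m cells.
state = encode_dtse([(2.0, 7.0), (12.0, 13.9)])
print(state.shape)  # (2, 20)
```

Because the result has a fixed, image-like shape regardless of how many vehicles are present, it can be fed directly to a convolutional Q-network, which is what makes DTSE attractive as a DQN input.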