Sequential Decision Problems Research Articles

Reinforcement learning (RL) has shown superior performance in solving sequential decision problems. In recent years, RL has increasingly been applied to collision avoidance decision-making for unmanned driving in complex scenarios. However, ships encounter many different scenarios, and this variation seriously hinders the application of RL to collision avoidance at sea. Moreover, trial-and-error learning iterates slowly in multi-ship encounter scenarios. To address these problems, this study develops a novel intelligent collision avoidance algorithm based on approximate representation reinforcement learning (AR-RL), which enables maritime autonomous surface ships (MASS) to avoid collisions in a continuous state space environment with an interactive learning capability similar to that of a crew in a navigation situation. The new algorithm uses an approximate representation model to optimize collision avoidance strategies in dynamic target encounter situations. The model is combined with prior knowledge and the International Regulations for Preventing Collisions at Sea (COLREGs) for optimal performance. An online solution to a value function approximation model is then designed based on gradient descent. This approach can solve the problem of large-scale collision avoidance policy learning in environments that mix static and dynamic obstacles. Finally, the algorithm was tested in two scenarios (the coastal static obstacle environment and the mixed static-dynamic obstacle environment), using Tianjin Port as an example, and compared with multiple groups of algorithms. The results show that, through approximate representation, the algorithm improves the efficiency of large-scale learning in the continuous state space of a dynamic obstacle environment. At the same time, the MASS can efficiently and safely avoid obstacles en route to its target destination. The study therefore contributes significantly to safety at sea in near-future mixed traffic involving both manned ships and MASS.
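
The abstract mentions an online, gradient-descent solution to a value function approximation model but gives no details. As a rough illustration of that general idea only, here is a minimal sketch of a semi-gradient TD(0) update for a linear value function. The feature map, the step size `alpha`, the discount `gamma`, and all function names are assumptions made for this example and are not taken from the paper.

```python
def features(state):
    """Hypothetical feature map: a bias term plus a scalar state in [0, 1]."""
    return [1.0, state]


def value(weights, state):
    """Linear value estimate v(s) = w . phi(s)."""
    return sum(w * x for w, x in zip(weights, features(state)))


def td0_update(weights, state, reward, next_state, done, alpha=0.1, gamma=0.95):
    """One online semi-gradient TD(0) step; returns the new weight list."""
    # Bootstrapped target: no future value is added once the episode has ended.
    target = reward if done else reward + gamma * value(weights, next_state)
    td_error = target - value(weights, state)
    # For a linear model, the gradient of v(s) w.r.t. the weights is phi(s).
    return [w + alpha * td_error * x for w, x in zip(weights, features(state))]


# Usage: apply one update per observed transition, as the transitions arrive.
weights = [0.0, 0.0]
weights = td0_update(weights, state=0.5, reward=1.0, next_state=0.5, done=True)
```

Because each update touches only a small weight vector rather than a table with one entry per state, this style of learning extends to continuous state spaces, which is the setting the abstract describes.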

Recommender systems (RSs) have become an inseparable part of our everyday lives. They help us find our favorite items to purchase, our friends on social networks, and our favorite movies to watch. Traditionally, the recommendation problem was considered to be a classification or prediction problem, but it is now widely agreed that formulating it as a sequential decision problem better reflects the user-system interaction. It can therefore be formulated as a Markov decision process (MDP) and solved by reinforcement learning (RL) algorithms. Unlike traditional recommendation methods, including collaborative filtering and content-based filtering, RL can handle the sequential, dynamic user-system interaction and take long-term user engagement into account. Although the idea of using RL for recommendation is not new and has been around for about two decades, it was not very practical, mainly because of the scalability problems of traditional RL algorithms. However, a new trend has emerged in the field since the introduction of deep reinforcement learning (DRL), which made it possible to apply RL to recommendation problems with large state and action spaces. In this paper, a survey on reinforcement learning based recommender systems (RLRSs) is presented. Our aim is to present an outlook on the field and to provide the reader with a fairly complete knowledge of its key concepts. We first recognize and illustrate that RLRSs can generally be classified into RL- and DRL-based methods. Then, we propose an RLRS framework with four components, i.e., state representation, policy optimization, reward formulation, and environment building, and survey RLRS algorithms accordingly. We highlight emerging topics and depict important trends using various graphs and tables. Finally, we discuss important aspects and challenges that can be addressed in the future.
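
To make the MDP formulation concrete, the sketch below treats the last item a user consumed as the state, the recommended item as the action, and a click as the reward, and applies a standard tabular Q-learning update. The item names, the reward signal, and the parameters `alpha` and `gamma` are all hypothetical and chosen only for illustration; the survey itself covers many other designs.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step; returns the updated Q(state, action)."""
    # Greedy estimate of the best value obtainable from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def recommend(q, state, actions):
    """Greedy policy: recommend the item with the highest Q-value in this state."""
    return max(actions, key=lambda a: q[(state, a)])


# Usage with a hypothetical two-item catalogue and a single observed click.
items = ["news", "sports"]
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
q_update(q, state="news", action="sports", reward=1.0,
         next_state="sports", actions=items)
print(recommend(q, "news", items))
```

The discounted `best_next` term is what lets the learner value long-term engagement rather than only the immediate click, which is the contrast with classification-style recommenders drawn in the abstract. A table of this kind grows with the number of state-action pairs, which is the scalability problem the abstract attributes to traditional RL.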

Articles published on Sequential Decision Problems

Distributed dynamic online learning with differential privacy via path-length measurement

Reinforcement Learning for Clinical Applications.

Role of reinforcement learning for risk-based robust control of cyber-physical energy systems.

OSTTD: Offloading of Splittable Tasks With Topological Dependence in Multi-Tier Computing Networks

Collision avoidance for autonomous ship using deep reinforcement learning and prior-knowledge-based approximate representation

Joint condition-based maintenance and spare provisioning policy for a K-out-of-N system with failures during inspection intervals

An electrical vehicle-assisted demand response management system: A reinforcement learning method

Online portfolio management via deep reinforcement learning with high-frequency data

Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation

Procedural- and Reinforcement-Learning-Based Automation Methods for Analog Integrated Circuit Sizing in the Electrical Design Space

A General Framework for Bandit Problems Beyond Cumulative Objectives

GUBS criterion: Arbitrary trade-offs between cost and probability-to-goal in stochastic planning based on Expected Utility Theory

Estimating Link Flows in Road Networks With Synthetic Trajectory Data Generation: Inverse Reinforcement Learning Approach

Reinforcement learning for automatic detection of effective strategies for self-regulated learning

Hybrid dynamic optimal tracking control of hydraulic cylinder speed in injection molding industry process

Novel Adaptive Transmission Scheme for Effective URLLC Support in 5G NR: A Model-Based Reinforcement Learning Solution

An anytime algorithm for constrained stochastic shortest path problems with deterministic policies

STACoRe: Spatio-temporal and action-based contrastive representations for reinforcement learning in Atari

Sequential dynamic resource allocation in multi-beam satellite systems: A learning-based optimization method

Reinforcement Learning based Recommender Systems: A Survey
