Accelerated Reinforcement Learning Research Articles

Graph optimization problems (such as minimum vertex cover, maximum cut, and the traveling salesman problem) appear in many fields, including the social sciences, power systems, chemistry, and bioinformatics. Recently, deep reinforcement learning (DRL) has shown success in automatically learning good heuristics for solving graph optimization problems. However, existing RL systems either do not support graph RL environments or do not support many GPUs in a distributed setting. This limits the ability of reinforcement learning to solve large-scale graph optimization problems, because such systems lack parallelization and scalability. To address these challenges, we develop RL4GO, a high-performance distributed-GPU DRL framework for solving graph optimization problems. RL4GO focuses on a class of computationally demanding RL problems in which both the RL environment and the policy model are highly computation intensive. Traditional reinforcement learning systems often assume that either the RL environment has low time complexity or the policy model is small. In this work, we distribute large-scale graphs across distributed GPUs and use spatial parallelism and data parallelism to achieve scalable performance. We compare and analyze the performance of the two forms of parallelism and show their differences. To support graph neural network (GNN) layers that take as input data samples partitioned across distributed GPUs, we design parallel mathematical kernels that operate on distributed 3D sparse and 3D dense tensors. To handle costly RL environments, we design a parallel graph environment that scales up all RL-environment-related operations. By combining the scalable GNN layers with the scalable RL environment, we develop high-performance parallel RL4GO training and inference algorithms.
Furthermore, we propose two optimization techniques (replay buffer on-the-fly graph generation and adaptive multiple-node selection) to minimize the spatial cost and accelerate reinforcement learning. This work also conducts in-depth analyses of parallel efficiency and memory cost and shows that the designed RL4GO algorithms are scalable on numerous distributed GPUs. Evaluations on large-scale graphs show that (1) RL4GO training and inference achieve good parallel efficiency on 192 GPUs, (2) its training can be 18 times faster than the state-of-the-art Gorila distributed RL framework [34], and (3) its inference performance achieves a 26-fold improvement over Gorila.
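To make the idea of multiple-node selection concrete, the following is a minimal sketch for minimum vertex cover, not the RL4GO implementation. It assumes a fixed number `k` of nodes chosen per step (the abstract does not state how the adaptive rule picks this number), and it uses a node's uncovered degree as a stand-in for the scores a learned GNN policy would produce. The names `degree_scores` and `vertex_cover_multi_select` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def degree_scores(uncovered: Set[Edge]) -> Dict[int, int]:
    """Stand-in for a learned policy: score each node by its uncovered degree."""
    scores: Dict[int, int] = {}
    for u, v in uncovered:
        scores[u] = scores.get(u, 0) + 1
        scores[v] = scores.get(v, 0) + 1
    return scores


def vertex_cover_multi_select(edges: List[Edge], k: int = 2) -> List[int]:
    """Build a vertex cover, adding up to k nodes per policy evaluation.

    Selecting several nodes per step reduces the number of (expensive)
    policy evaluations compared with adding one node at a time.
    """
    uncovered: Set[Edge] = set(edges)
    cover: List[int] = []
    while uncovered:
        scores = degree_scores(uncovered)  # one "policy evaluation" per step
        # Take the k highest-scoring nodes; break ties by node id.
        picked = sorted(scores, key=lambda n: (-scores[n], n))[:k]
        for node in picked:
            # Skip a node whose edges were already covered earlier in this step.
            if any(node in edge for edge in uncovered):
                cover.append(node)
                uncovered = {e for e in uncovered if node not in e}
    return cover


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]
    print(vertex_cover_multi_select(path, k=2))
```

With `k=1` the sketch reduces to the usual one-node-per-step greedy construction; larger `k` trades some solution quality for fewer policy evaluations.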

In intelligent manufacturing for the process industry, traditional model-based optimization control methods cannot adapt to drastic changes in working conditions or operating modes. Reinforcement learning (RL) achieves the control objective directly by interacting with the environment, and it has significant advantages under uncertainty because it does not require an explicit model of the operating plant. However, most RL algorithms fail to retain transfer learning capabilities when the operating mode varies, which is a practical obstacle to industrial process control applications. To address these issues, we design a framework that uses local data augmentation to improve training efficiency and transfer learning (adaptability) performance. Specifically, this paper proposes a novel RL control algorithm, CBR-MA-DDPG, which integrates case-based reasoning (CBR), model-assisted (MA) experience augmentation, and deep deterministic policy gradient (DDPG). When the operating mode changes, CBR-MA-DDPG can quickly adapt to the varying environment and achieve the desired control performance within several training episodes. Experimental analyses on a continuous stirred tank reactor (CSTR) and an organic Rankine cycle (ORC) demonstrate the superiority of the proposed method in terms of adaptability, control performance, and robustness. The results show that the CBR-MA-DDPG agent outperforms conventional PI and MPC control schemes, and that it has higher training efficiency than the state-of-the-art DDPG, TD3, and PPO algorithms in transfer learning scenarios with mode shifts.
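The idea of model-assisted experience augmentation can be illustrated with a small sketch. This is an assumption-laden illustration, not the CBR-MA-DDPG algorithm: it assumes scalar states and actions, an approximate one-step plant model `model(state, action)`, and a reward function `reward_fn(next_state)`, all of which are hypothetical names. Synthetic transitions are generated in a small neighborhood of each real transition and could then be added to a replay buffer alongside the real data.

```python
import random
from typing import Callable, List, Tuple

# (state, action, reward, next_state), all scalars for simplicity
Transition = Tuple[float, float, float, float]


def augment_locally(
    real: List[Transition],
    model: Callable[[float, float], float],
    reward_fn: Callable[[float], float],
    n_extra: int = 3,
    noise: float = 0.05,
    seed: int = 0,
) -> List[Transition]:
    """Generate synthetic transitions near each real one using an approximate model."""
    rng = random.Random(seed)
    synthetic: List[Transition] = []
    for state, action, _, _ in real:
        for _ in range(n_extra):
            # Perturb the real state and action within a small local region.
            s = state + rng.uniform(-noise, noise)
            a = action + rng.uniform(-noise, noise)
            # Query the approximate model instead of the real plant.
            s_next = model(s, a)
            synthetic.append((s, a, reward_fn(s_next), s_next))
    return synthetic


if __name__ == "__main__":
    # Hypothetical linear plant model and a quadratic tracking reward (setpoint 1.0).
    plant_model = lambda s, a: 0.9 * s + 0.1 * a
    tracking_reward = lambda s: -((s - 1.0) ** 2)
    real_data = [(0.0, 1.0, -1.0, 0.1), (0.5, 0.2, -0.25, 0.47)]
    extra = augment_locally(real_data, plant_model, tracking_reward)
    print(len(extra), "synthetic transitions")
```

Keeping the perturbation small restricts the synthetic data to the region where an approximate model is most likely to be trustworthy.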

Articles published on Accelerated Reinforcement Learning

Using Natural Language to Improve Hierarchical Reinforcement Learning in Games

Disassembly line optimization with reinforcement learning

Accelerated Reinforcement Learning via Dynamic Mode Decomposition

Reward shaping using directed graph convolution neural networks for reinforcement learning and games

A Distributed-GPU Deep Reinforcement Learning System for Solving Large Graph Optimization Problems

Accelerating Reinforcement Learning via Predictive Policy Transfer in 6G RAN Slicing

A heuristically accelerated reinforcement learning method for maintenance policy of an assembly line

Accelerating reinforcement learning with case-based model-assisted experience augmentation for process control

DiffSRL: Learning Dynamical State Representation for Deformable Object Manipulation With Differentiable Simulation

Photonic reinforcement learning based on optoelectronic reservoir computing

Learning Task-Distribution Reward Shaping with Meta-Learning

Efficient Reinforcement Learning for StarCraft by Abstract Forward Models and Transfer Learning

Collaborative Framework of Accelerating Reinforcement Learning Training with Supervised Learning Based on Edge Computing

DQL energy management: An online-updated algorithm and its application in fix-line hybrid electric vehicle

Accelerating reinforcement learning with a Directional-Gaussian-Smoothing evolution strategy

Heuristically accelerated reinforcement learning for channel assignment in wireless sensor networks

A Memory-Greedy Policy With Guaranteed Convergence for Accelerating Reinforcement Learning

Model accelerated reinforcement learning for high precision robotic assembly

Coaching: accelerating reinforcement learning through human-assisted approach
