Abstract
The job shop scheduling problem (JSSP) is a well-known NP-hard combinatorial optimization problem concerned with assigning tasks to limited resources while adhering to certain constraints. Deep reinforcement learning (DRL)-based solutions are now widely used to solve the JSSP by defining the problem structure on disjunctive graphs. Some of the proposed approaches leverage the structural information of the JSSP to capture the dynamics of the environment, but they disregard its time dependency. Learning graph representations solely from the structural relationships among nodes yields weak, incomplete representations that fail to express the dynamics of the environment. In this study, unlike existing frameworks, we define the JSSP as a dynamic graph to explicitly account for the time-varying nature of the environment. To this end, we propose a novel DRL framework that captures both the spatial and temporal attributes of the JSSP to construct rich and complete graph representations. Our framework introduces a novel attentive graph isomorphism network (Attentive-GIN)-based spatial block to learn the structural relationships and a temporal block to capture the time dependency. Additionally, we design a gated fusion block that selectively combines the representations learned by the two blocks. We trained the model using the proximal policy optimization (PPO) reinforcement learning algorithm. Experimental results show that the trained model significantly outperforms heuristic dispatching rules and existing learning-based solutions on both randomly generated datasets and public benchmarks.
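The gated fusion described above can be illustrated with a minimal sketch (not the authors' code): it assumes the spatial and temporal blocks each emit per-node embeddings of the same hidden size, and that a sigmoid gate computed from their concatenation weights how much each branch contributes. All module and variable names here are hypothetical.

```python
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    """Selectively combine spatial and temporal node embeddings (illustrative sketch)."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        # Gate is computed from the concatenated spatial and temporal features.
        self.gate = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.Sigmoid(),
        )

    def forward(self, h_spatial: torch.Tensor, h_temporal: torch.Tensor) -> torch.Tensor:
        # z in (0, 1) weights the contribution of each branch per feature.
        z = self.gate(torch.cat([h_spatial, h_temporal], dim=-1))
        return z * h_spatial + (1.0 - z) * h_temporal


# Example usage: fuse embeddings for 30 operations (nodes) with hidden size 64.
fusion = GatedFusion(hidden_dim=64)
h_s = torch.randn(30, 64)  # placeholder output of the Attentive-GIN spatial block
h_t = torch.randn(30, 64)  # placeholder output of the temporal block
h = fusion(h_s, h_t)       # fused node representations, shape (30, 64)
```

The fused embeddings would then feed the PPO policy and value heads that score dispatching actions; the exact heads and state features are specified in the full paper.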