Batch Mode Reinforcement Learning Research Articles

Low-voltage distribution networks deliver power over the last mile of the network, but they are often legacy assets from a time when low-carbon technologies, e.g., electrified heat, storage, and electric vehicles, were not envisaged. Furthermore, exploiting emerging data from distribution networks to support decisions that adapt planning and operational strategies as the system transitions presents a challenge. To overcome these challenges, this paper proposes a novel application of digital-twin-based reinforcement learning to improve decision making by a distribution system operator, with key metrics of predictability, responsiveness, interoperability, and automation. The power system states, i.e., network configurations, technological combinations, and load patterns, are captured by a convolutional neural network, chosen for its pattern-recognition capability with high-dimensional inputs. The convolutional neural networks are trained iteratively with the fitted Q-iteration algorithm, a batch-mode reinforcement learning method, to adapt the planning and operational decisions to the dynamic system transitions. Case studies demonstrate the effectiveness of the proposed model: it reduces the investment cost by 50% when the system transitions towards winter and keeps the power loss and loss of load within 5% of the benchmark optimisation. Doubled power consumption was observed in winter under future energy scenarios due to the electrification of heat. The trained model can accurately adapt optimal decisions to system changes while reducing the computational time of solving optimisation problems, for a range of scales of distribution systems, demonstrating its potential for scalable deployment by a system operator.
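The fitted Q-iteration step named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the convolutional neural network is replaced here by a per-action ridge regression on one-hot state features, and the 5-state chain problem, the function name `fitted_q_iteration`, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np


def fitted_q_iteration(batch, n_actions, gamma=0.95, n_iters=50, ridge=1e-6):
    """Batch-mode fitted Q-iteration with one ridge regressor per action.

    `batch` is a fixed list of (state_features, action, reward,
    next_state_features) tuples; no new samples are collected.
    """
    states, actions, rewards, next_states = (np.asarray(x) for x in zip(*batch))
    dim = states.shape[1]
    weights = np.zeros((n_actions, dim))  # Q(s, a) = weights[a] @ features(s)
    for _ in range(n_iters):
        # Regression targets: r + gamma * max_a' Q(s', a') under the current fit.
        next_q = next_states @ weights.T
        targets = rewards + gamma * next_q.max(axis=1)
        new_weights = np.zeros_like(weights)
        for a in range(n_actions):
            mask = actions == a
            if not mask.any():
                continue  # no samples for this action in the batch
            x, y = states[mask], targets[mask]
            new_weights[a] = np.linalg.solve(
                x.T @ x + ridge * np.eye(dim), x.T @ y
            )
        weights = new_weights
    return weights


# Hypothetical toy problem: a 5-state chain, action 0 = left, action 1 = right,
# reward 1 whenever the transition lands in the rightmost state.
N_STATES = 5


def one_hot(index):
    vec = np.zeros(N_STATES)
    vec[index] = 1.0
    return vec


batch = []
for state in range(N_STATES):
    for action in (0, 1):
        nxt = max(state - 1, 0) if action == 0 else min(state + 1, N_STATES - 1)
        reward = 1.0 if nxt == N_STATES - 1 else 0.0
        batch.append((one_hot(state), action, reward, one_hot(nxt)))

weights = fitted_q_iteration(batch, n_actions=2)
greedy_policy = (np.eye(N_STATES) @ weights.T).argmax(axis=1)
```

In this toy setting the greedy policy moves right in every state. Swapping the ridge regressor for a neural network changes only the regression step inside the loop; the target construction stays the same.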

As a class of batch-mode reinforcement learning (RL) methods for Markov decision problems with large or continuous state spaces, approximate policy iteration (API) has received increasing attention in recent decades. One open problem in the design of API algorithms is how to construct the basis functions, or features, for value function approximation (VFA). In this paper, we propose a novel batch-mode RL approach with randomly projected features for VFA. The proposed approach can be viewed as an extension of extreme learning machines (ELMs) to RL problems, so it can be called ELM-API. ELMs have been widely studied in supervised learning problems, but there is little work on extending them to learning control problems. The proposed approach has two advantages over previous API algorithms: the features for VFA can be generated quickly without complex parameter selection, and the performance adapts to different sample sets in batch-mode RL. In particular, the ELM-API approach can realize fast and efficient feature reconstruction when training sample sets are relatively small. Comprehensive simulation studies on two benchmark learning control problems were carried out to test the performance of API algorithms with different feature construction methods. The results show that the ELM-API algorithm can obtain comparable or better performance than previous API approaches. To further show the effectiveness of ELM-API in real-world applications, simulation results on a more challenging high-dimensional lane-changing decision problem in a dynamic traffic environment are also reported; they show that the ELM-API algorithm can learn satisfactory lane-changing policies with high data efficiency.
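The idea of randomly projected features for VFA can be sketched as follows. This covers only the feature-construction and least-squares fitting step in the ELM style (a fixed random hidden layer with output weights solved in closed form); the policy iteration loop of ELM-API is omitted. The function names, the tanh activation, the ridge term, and the sine-shaped targets are all assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def elm_features(states, n_hidden=32, seed=0):
    """Map states of shape (N, d) to random hidden activations (N, n_hidden).

    The input weights and biases are drawn once from a seeded generator and
    never trained, which is the defining trait of an ELM hidden layer.
    """
    rng = np.random.default_rng(seed)
    states = np.atleast_2d(np.asarray(states, dtype=float))
    w_in = rng.normal(size=(states.shape[1], n_hidden))
    bias = rng.normal(size=n_hidden)
    return np.tanh(states @ w_in + bias)


def fit_value_weights(features, targets, ridge=1e-3):
    """Solve the regularized least-squares problem for the output weights."""
    dim = features.shape[1]
    gram = features.T @ features + ridge * np.eye(dim)
    return np.linalg.solve(gram, features.T @ targets)


# Hypothetical data: 20 one-dimensional states with made-up value targets.
states = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
targets = np.sin(2.0 * np.pi * states[:, 0])

phi = elm_features(states)
w_out = fit_value_weights(phi, targets)
predicted_values = phi @ w_out
```

Because only the output weights are fitted, the features can be regenerated cheaply for each new sample set by changing `seed` or `n_hidden`, with no per-feature tuning.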

Articles published on Batch Mode Reinforcement Learning

Digital twin based reinforcement learning for extracting network structures and load patterns in planning and operation of distribution systems

Kernel-Based Reinforcement Learning on Representative States

Efficient Batch-Mode Reinforcement Learning Using Extreme Learning Machines

Batch Mode TD($\lambda$) for Controlling Partially Observable Gene Regulatory Networks

Reinforcement Learning of Heuristic EV Fleet Charging in a Day-Ahead Electricity Market

Trajectory-Based Supplementary Damping Control for Power System Electromechanical Oscillations

Planning the Optimal Operation of a Multioutlet Water Reservoir with Water Quality and Quantity Targets

Min Max Generalization for Deterministic Batch Mode Reinforcement Learning: Relaxation Schemes

Batch mode reinforcement learning based on the synthesis of artificial trajectories

Tree-Based Batch Mode Reinforcement Learning

