Abstract

Stochastic dynamic programming (SDP) is a widely used method for optimizing reservoir operations under uncertainty, but it suffers from the dual curses of dimensionality and modeling. Reinforcement learning (RL), a simulation-based stochastic optimization approach, eliminates the curse of modeling, which arises from the need to calculate a very large transition probability matrix. RL mitigates the curse of dimensionality but cannot remove it entirely, as it remains computationally intensive for complex multi-reservoir systems. This paper presents a multi-agent RL approach combined with an aggregation/decomposition method (AD-RL) for reducing the curse of dimensionality in multi-reservoir operation optimization problems. In this model, each reservoir is managed by a dedicated operator (agent) that cooperates systematically with the other agents to find a near-optimal operating policy for the whole system. Each agent decides on a release based on its own current state and the feedback it receives from the states of all upstream and downstream reservoirs. The method, together with an efficient and robust artificial neural network-based procedure for tuning the Q-learning parameters, has been applied to a real-world five-reservoir problem, the Parambikulam–Aliyar Project (PAP) in India. We demonstrate that the proposed AD-RL approach derives operating policies that are better than, or comparable to, those obtained by other stochastic optimization methods, at a lower computational cost.
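As a rough illustration of the per-reservoir learning step described above, the sketch below implements a tabular Q-learning agent whose state combines its own (discretized) storage with coarse information about upstream and downstream reservoirs. The state encoding, reward, and parameter values are illustrative assumptions, not the formulation used in the paper.

```python
import numpy as np


class ReservoirAgent:
    """One Q-learning agent per reservoir (illustrative sketch, not the paper's exact method)."""

    def __init__(self, n_states, n_releases, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.Q = np.zeros((n_states, n_releases))  # Q-table: state x discretized release
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state, rng):
        # Epsilon-greedy choice of a discretized release
        if rng.random() < self.epsilon:
            return int(rng.integers(self.Q.shape[1]))
        return int(np.argmax(self.Q[state]))

    def update(self, state, action, reward, next_state):
        # Standard Q-learning temporal-difference update
        td_target = reward + self.gamma * np.max(self.Q[next_state])
        self.Q[state, action] += self.alpha * (td_target - self.Q[state, action])


def encode_state(own_bin, upstream_bin, downstream_bin, n_bins=10):
    # The agent's state couples its own storage bin with coarse (aggregated)
    # upstream/downstream storage bins; this particular encoding is an assumption.
    return (own_bin * n_bins + upstream_bin) * n_bins + downstream_bin


# Example usage with made-up bins and reward
rng = np.random.default_rng(42)
agent = ReservoirAgent(n_states=10 ** 3, n_releases=5)
s = encode_state(own_bin=4, upstream_bin=7, downstream_bin=2)
a = agent.act(s, rng)
agent.update(s, a, reward=1.0, next_state=encode_state(5, 6, 2))
```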

Highlights

  • Multi-reservoir optimization models are generally non-linear, non-convex, and large-scale in terms of the number of variables and constraints

  • This paper presents a reinforcement learning (RL)-based model combined with an aggregation–decomposition (AD) approach that reduces the dimensionality problem, enabling the efficient solution of a stochastic multi-reservoir operation optimization problem

  • We present the results of the proposed aggregation–decomposition reinforcement learning (AD-RL) approach for optimizing the operations of the Parambikulam–Aliyar Project (PAP) multi-reservoir system and compare them with those of three other stochastic optimization methods: MAM-dynamic programming (DP), FP, and aggregation–decomposition dynamic programming (AD-DP)

Introduction

Multi-reservoir optimization models are generally non-linear, non-convex, and large-scale in terms of the number of variables and constraints. Uncertainties in stochastic variables such as inflows, evaporation, and demands make it difficult to find even a sub-optimal operating policy. Two types of stochastic programming approaches are used to optimize multi-reservoir system operations under uncertainty, i.e., implicit and explicit. In implicit stochastic optimization (ISO), a large number of historical or synthetically generated sequences of random variables, such as streamflows, are generated and used as input to a deterministic optimization model. These sequences represent different aspects of the underlying stochastic process, such as spatial or temporal correlations among the random variables involved. Optimal operating policies are then derived by post-processing the outputs of the deterministic optimization model solved for the different input sequences (samples).
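For concreteness, the following sketch illustrates the generic ISO workflow just described: synthetic inflow sequences drive a (deliberately simplified) deterministic release computation, and the resulting storage–inflow–release samples are post-processed by linear regression into an operating rule. The lognormal inflow model, the toy single-reservoir solver, and the regression-based rule are illustrative assumptions, not a specific method from the literature discussed here.

```python
import numpy as np


def generate_inflow_sequences(n_sequences, horizon, mean=100.0, sigma=0.3, seed=0):
    # Lognormal synthetic inflows (placeholder for a proper streamflow generator)
    rng = np.random.default_rng(seed)
    return mean * rng.lognormal(mean=0.0, sigma=sigma, size=(n_sequences, horizon))


def deterministic_release_policy(inflows, capacity=500.0, demand=90.0):
    # Toy deterministic rule standing in for a full deterministic optimization run:
    # release up to the demand when water is available, spill above capacity.
    storage, records = 0.5 * capacity, []
    for q in inflows:
        release = min(demand, storage + q)
        storage = min(storage + q - release, capacity)
        records.append((storage, q, release))
    return np.array(records)


# Post-processing step: fit a linear operating rule, release ~ a*storage + b*inflow + c,
# across all sequences (samples) produced by the deterministic model.
samples = np.vstack([deterministic_release_policy(seq)
                     for seq in generate_inflow_sequences(50, 120)])
X = np.column_stack([samples[:, 0], samples[:, 1], np.ones(len(samples))])
coeffs, *_ = np.linalg.lstsq(X, samples[:, 2], rcond=None)
print("fitted rule coefficients (storage, inflow, intercept):", coeffs)
```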
