Abstract

The operation of large-scale water resources systems often involves several conflicting and noncommensurable objectives. Fully characterizing the trade-offs among them is a necessary step to inform and support decisions in the absence of a unique optimal solution. In this context, the common approach is to consider many single-objective problems, obtained from different combinations of the original problem objectives, each one solved using standard optimization methods based on mathematical programming. This scalarization process is computationally very demanding, as it requires one optimization run for each trade-off, and often results in very sparse and poorly informative representations of the Pareto frontier. More recently, bio-inspired methods have been applied to compute an approximation of the Pareto frontier in a single run. These methods can cover the full extent of the Pareto frontier acceptably well with reasonable computational effort, yet the quality of the policies obtained may depend strongly on algorithm tuning and preconditioning. In this paper we propose a novel multiobjective Reinforcement Learning algorithm that combines the advantages of the above two approaches and alleviates some of their drawbacks. The proposed algorithm is an extension of fitted Q-iteration (FQI) that enables learning the operating policies for all linear combinations of preferences (weights) assigned to the objectives in a single training process. The key idea of multiobjective FQI (MOFQI) is to extend the continuous approximation of the value function, which single-objective FQI performs over the state-decision space, to include the weight space as well. The approach is demonstrated on a real-world case study concerning the optimal operation of the Hoa Binh reservoir on the Da river, Vietnam. MOFQI is compared with the reiterated use of FQI and a multiobjective parameterization-simulation-optimization (MOPSO) approach. Results show that MOFQI provides a continuous approximation of the Pareto front with accuracy comparable to the reiterated use of FQI. MOFQI outperforms MOPSO when no a priori knowledge of the operating policy shape is available, while producing slightly less accurate solutions when MOPSO can exploit such knowledge.
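To make the key idea more concrete, the following is a minimal sketch, not the authors' implementation, of fitted Q-iteration over a joint state-decision-weight space. It assumes a batch of transition tuples with vector-valued rewards, a finite set of candidate decisions, and a tree-based regressor (here scikit-learn's ExtraTreesRegressor as a stand-in) as the Q-function approximator; the function names `mofqi` and `greedy_action` are hypothetical.

```python
# Minimal MOFQI-style sketch (assumptions noted above; not the paper's code).
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor


def mofqi(transitions, actions, weight_samples, gamma=0.99, n_iterations=30):
    """Fitted Q-iteration with the weight vector appended to the regressor input.

    transitions    : list of (s, a, r_vec, s_next) with r_vec a vector reward
    actions        : array of candidate (discretized) decisions
    weight_samples : array of weight vectors (each summing to 1) sampled for training
    """
    # Replicate each transition for every sampled weight vector and
    # scalarize the vector reward with the corresponding weights.
    X, R, S_next, W = [], [], [], []
    for (s, a, r_vec, s_next) in transitions:
        for w in weight_samples:
            X.append(np.concatenate([np.atleast_1d(s), np.atleast_1d(a), w]))
            R.append(np.dot(w, r_vec))
            S_next.append(np.atleast_1d(s_next))
            W.append(w)
    X, R = np.array(X), np.array(R)
    S_next, W = np.array(S_next), np.array(W)

    q_model = None
    targets = R.copy()  # first iteration: Q_1 equals the scalarized immediate reward
    for _ in range(n_iterations):
        q_model = ExtraTreesRegressor(n_estimators=50).fit(X, targets)
        # Bellman backup: maximize Q(s', a', w) over the candidate decisions.
        best_next = np.full(len(X), -np.inf)
        for a_next in actions:
            Xn = np.hstack([S_next, np.tile(np.atleast_1d(a_next), (len(X), 1)), W])
            best_next = np.maximum(best_next, q_model.predict(Xn))
        targets = R + gamma * best_next
    return q_model


def greedy_action(q_model, s, w, actions):
    """Operating decision for state s under the preference weights w."""
    candidates = [np.concatenate([np.atleast_1d(s), np.atleast_1d(a), w]) for a in actions]
    return actions[int(np.argmax(q_model.predict(np.array(candidates))))]
```

Because the weights are an input to the learned value function, a single trained model can be queried with any weight vector, so sweeping the weights at policy-evaluation time traces a continuous approximation of the Pareto front without retraining.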
