MARLISA

Jose R Vazquez-Canteli, Zoltan Nagy, Gregor Henze

doi:10.1145/3408308.3427604

Abstract

We demonstrate that multi-agent reinforcement learning (RL) controllers can cooperate to provide more effective load shaping in a model-free, decentralized, and scalable way with very limited sharing of anonymous information. Rapid urbanization, increasing electrification, the integration of renewable energy resources, and the potential shift towards electric vehicles create new challenges for the planning and control of energy systems in smart cities. Energy storage resources can help better align peaks of renewable energy generation with peaks of electricity consumption and flatten the curve of electricity demand. Model-based controllers, such as model predictive control (MPC), require developing models of the systems controlled, which is often not cost-effective or scalable. Model-free controllers, such as RL, have the potential to provide good control policies cost-effectively and leverage the use of historical data for training. However, it is unclear how RL algorithms can control a multitude of energy systems in a scalable, coordinated way. In this paper, we introduce MARLISA, a controller that combines multi-agent RL with our proposed iterative sequential action selection algorithm for load shaping in urban energy systems. This approach uses a reward function with individual and collective goals, and the agents predict their own future electricity consumption and share this information with each other following a leader-follower schema. The RL agents are tested in four groups of nine simulated buildings, with each group located in a different climate. The buildings have diverse load and domestic hot water profiles, PV panels, thermal storage devices, heat pumps, and electric heaters. The agents are evaluated on the average of five normalized metrics: annual net electric consumption, 1 - load factor, average daily peak demand, annual peak demand, and ramping. MARLISA achieves superior results over multiple independent/uncooperative RL agents using the same reward function.
Our results outperformed a manually optimized rule-based controller (RBC) benchmark by reducing the average daily peak load by 15%, ramping by 35%, and increasing the load factor by 10%. A multi-year case study on real weather data shows that MARLISA significantly outperforms the RBC within a year and converges in less than 2 years. Combining MARLISA and the RBC for the first year improves overall initial performance by learning from the RBC rather than from random exploration.
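
The abstract names five normalized metrics but does not define them, so the following is only a minimal sketch of how such a score could be computed from an hourly district net-load profile. The function names (`load_shaping_metrics`, `normalized_score`), the exact metric definitions (load factor as mean over peak for the whole period, ramping as the sum of absolute hour-to-hour changes), and the normalization by a baseline controller are all assumptions for illustration, not the authors' implementation.

```python
from statistics import mean


def load_shaping_metrics(net_load, hours_per_day=24):
    """Compute five load-shaping metrics from an hourly net-load series (kWh).

    Assumed definitions (not taken from the paper):
      - net_consumption: plain sum of the series
      - one_minus_load_factor: 1 - mean / peak over the whole series
      - avg_daily_peak: mean of each day's maximum
      - peak_demand: maximum over the whole series
      - ramping: sum of absolute hour-to-hour changes
    """
    if not net_load or len(net_load) % hours_per_day != 0:
        raise ValueError("net_load must cover a whole number of days")

    # Split the series into consecutive days.
    days = [
        net_load[i:i + hours_per_day]
        for i in range(0, len(net_load), hours_per_day)
    ]
    peak = max(net_load)
    return {
        "net_consumption": sum(net_load),
        "one_minus_load_factor": 1.0 - mean(net_load) / peak,
        "avg_daily_peak": mean([max(day) for day in days]),
        "peak_demand": peak,
        "ramping": sum(abs(b - a) for a, b in zip(net_load, net_load[1:])),
    }


def normalized_score(controller_load, baseline_load):
    """Average of the five metrics, each divided by the baseline's value.

    A score below 1 means the controller shapes load better than the
    baseline (for example, a rule-based controller). Assumes every
    baseline metric is non-zero.
    """
    controller = load_shaping_metrics(controller_load)
    baseline = load_shaping_metrics(baseline_load)
    return mean([controller[k] / baseline[k] for k in controller])


if __name__ == "__main__":
    # Hypothetical one-day profiles: a spiky baseline and a flat controller.
    baseline = [1.0, 3.0] * 12
    flat = [2.0] * 24
    print(normalized_score(flat, baseline))
```

In this toy case both profiles consume the same energy, so the flat profile improves the score only through a lower peak, a higher load factor, and zero ramping, which is the kind of load shaping the abstract describes.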


