Abstract
We demonstrate that multi-agent reinforcement learning (RL) controllers can cooperate to provide more effective load shaping in a model-free, decentralized, and scalable way with very limited sharing of anonymous information. Rapid urbanization, increasing electrification, the integration of renewable energy resources, and the potential shift towards electric vehicles create new challenges for the planning and control of energy systems in smart cities. Energy storage resources can help better align peaks of renewable energy generation with peaks of electricity consumption and flatten the curve of electricity demand. Model-based controllers, such as MPC, require developing models of the systems controlled, which is often not cost-effective or scalable. Model-free controllers, such as RL, have the potential to provide good control policies cost-effectively and leverage the use of historical data for training. However, it is unclear how RL algorithms can control a multitude of energy systems in a scalable, coordinated way. In this paper, we introduce MARLISA, a controller that combines multi-agent RL with our proposed iterative sequential action selection algorithm for load shaping in urban energy systems. This approach uses a reward function with individual and collective goals, and the agents predict their own future electricity consumption and share this information with each other following a leader-follower schema. The RL agents are tested in four groups of nine simulated buildings, with each group located in a different climate. The buildings have diverse load and domestic hot water profiles, PV panels, thermal storage devices, heat pumps, and electric heaters. The agents are evaluated on the average of five normalized metrics: annual net electric consumption, 1 - load factor, average daily peak demand, annual peak demand, and ramping. MARLISA achieves superior results over multiple independent/uncooperative RL agents using the same reward function.
MARLISA also outperformed a manually optimized rule-based controller (RBC) benchmark, reducing the average daily peak load by 15% and ramping by 35%, and increasing the load factor by 10%. A multi-year case study on real weather data shows that MARLISA significantly outperforms the RBC within a year and converges in less than 2 years. Combining MARLISA and the RBC for the first year improves overall initial performance by learning from the RBC rather than from random exploration.
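
The evaluation score mentioned above (the average of five normalized metrics) can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so several choices here are assumptions: the load factor is taken as mean load divided by peak load over the whole series, ramping as the sum of absolute hour-to-hour changes in net load, and each metric is normalized by dividing by the same metric for a baseline controller such as the RBC. The function names `load_shaping_metrics` and `normalized_score` are hypothetical.

```python
from statistics import mean

HOURS_PER_DAY = 24


def load_shaping_metrics(net_load):
    """Raw metrics for an hourly district net-load series (e.g. kWh per hour).

    Assumed definitions, not taken from the paper:
    - load factor = mean load / peak load over the whole series
    - ramping     = sum of absolute changes between consecutive hours
    Over a one-year series, net_consumption and peak_demand are annual values.
    """
    load = [float(x) for x in net_load]
    if not load or len(load) % HOURS_PER_DAY != 0:
        raise ValueError("expected a whole number of days of hourly data")
    daily_peaks = [
        max(load[start:start + HOURS_PER_DAY])
        for start in range(0, len(load), HOURS_PER_DAY)
    ]
    peak = max(load)  # assumed positive; a zero peak would divide by zero below
    return {
        "net_consumption": sum(load),
        "one_minus_load_factor": 1.0 - mean(load) / peak,
        "average_daily_peak": mean(daily_peaks),
        "peak_demand": peak,
        "ramping": sum(abs(b - a) for a, b in zip(load, load[1:])),
    }


def normalized_score(controller_load, baseline_load):
    """Average of the five metrics, each divided by the baseline's value.

    A score below 1 means the controller beats the baseline on average.
    Assumes every baseline metric is non-zero (a perfectly flat baseline
    would have zero ramping and zero 1 - load factor).
    """
    controller = load_shaping_metrics(controller_load)
    baseline = load_shaping_metrics(baseline_load)
    ratios = {name: controller[name] / baseline[name] for name in controller}
    return mean(ratios.values()), ratios


if __name__ == "__main__":
    # Two synthetic days: the controller uses the same energy with a flatter profile.
    baseline = ([1.0] * 12 + [3.0] * 12) * 2
    controller = ([1.5] * 12 + [2.5] * 12) * 2
    score, ratios = normalized_score(controller, baseline)
    print(f"score = {score:.3f}")
    for name, ratio in ratios.items():
        print(f"  {name}: {ratio:.3f}")
```

Under these assumptions, a controller that only shifts load in time leaves the net-consumption ratio at 1 while lowering the other four ratios, which is why a flatter profile produces a score below 1.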