Abstract
Electric water heaters represent 14% of the electricity consumption in residential buildings. An average household in the United States (U.S.) spends about USD 400–600 (0.45 ¢/L–0.68 ¢/L) on water heating every year. In this context, water heaters are often considered a valuable asset for Demand Response (DR) and building energy management system (BEMS) applications. To this end, this study proposes a model-free deep reinforcement learning (RL) approach that aims to minimize the electricity cost of a water heater under a time-of-use (TOU) electricity pricing policy using only standard DR commands. In this approach, a set of RL agents with different look-ahead periods were trained using the deep Q-networks (DQN) algorithm, and their performance was tested on an unseen pair of price and hot water usage profiles. The testing results showed that the RL agents can reduce electricity costs by 19% to 35% compared to the baseline operation without causing any discomfort to end users. Additionally, the RL agents outperformed rule-based and model predictive control (MPC)-based controllers and achieved performance comparable to optimization-based control.
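For concreteness, the sketch below illustrates the kind of DQN update that trains an agent of this type. It is a generic illustration under assumed choices, not the configuration reported in the paper: the state features (tank temperature, hour of day, current TOU price, and look-ahead prices), the two DR actions, and all hyperparameters are assumptions introduced here for illustration.

```python
# Minimal DQN sketch for a water heater under TOU pricing (illustrative only).
# State features, action set, network size, and hyperparameters are assumptions.
import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, n_state, n_action, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_state, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_action),
        )

    def forward(self, s):
        return self.net(s)

# Hypothetical state: [tank temperature, hour of day, current TOU price,
# look-ahead prices for the next k hours]; actions: {0: off, 1: heat}
# (i.e., standard on/off DR commands).
LOOKAHEAD_HOURS = 4
N_STATE = 3 + LOOKAHEAD_HOURS
N_ACTION = 2
GAMMA = 0.99

q_net = QNetwork(N_STATE, N_ACTION)
target_net = QNetwork(N_STATE, N_ACTION)
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def select_action(state, epsilon):
    """Epsilon-greedy action selection over the Q-network outputs."""
    if random.random() < epsilon:
        return random.randrange(N_ACTION)
    with torch.no_grad():
        return int(q_net(state).argmax().item())

def dqn_update(batch):
    """One DQN gradient step on a batch of (s, a, r, s_next, done) tensors."""
    s, a, r, s_next, done = batch
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * (1.0 - done) * target_net(s_next).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In such a formulation, the per-step reward would typically be the negative electricity cost of the chosen action, possibly with a penalty when the tank temperature falls below a comfort limit; the exact reward design used by the authors is not specified here.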
Highlights
The comparison between reinforcement learning (RL) agents and other controllers was conducted over five days because the model predictive control (MPC)-based controllers require solving an optimization problem at each control time step.
Longer simulation periods result in longer computational times.
This paper presented an RL-based water heater control approach that minimizes the electricity cost of a water heater under a TOU electricity pricing policy.
Summary
The use of fossil fuels continues to pose adverse environmental impacts on the ecosystem in terms of global warming and pollution. Renewable energy sources, such as solar, wind, biofuels, and hydro, are expected to play a key role in transforming harmful carbon-intensive energy generation systems into more sustainable ones [1]. If the characteristics of the environment are known, approaches such as model-based RL or dynamic programming (DP) can be taken. In these approaches, the RL agent learns or is given the characteristics of the environment and then finds the optimal policy. The use of a state value function Vπ(s), as defined in Equation (1), is more applicable for model-based approaches [33].
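Equation (1) itself is not reproduced in this summary; in the standard discounted-return formulation that the text appears to refer to, the state value function is defined as

```latex
V^{\pi}(s) \;=\; \mathbb{E}_{\pi}\!\left[\,\sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1} \;\middle|\; s_{t}=s \right],
```

where γ ∈ [0, 1) is the discount factor and r_{t+k+1} are the rewards obtained by following policy π from state s.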