Online Residential Demand Response via Contextual Multi-Armed Bandits

Xin Chen, Na Li, Yutong Nie

doi:10.1109/lcsys.2020.3003190

Open Access

https://doi.org/10.1109/lcsys.2020.3003190

Journal: IEEE Control Systems Letters
Publication Date: Jun 2, 2020
Citations: 43
License type: publisher-specific, author manuscript

Affiliation: Harvard University, Zhejiang University

Abstract

Residential loads have great potential to enhance the efficiency and reliability of electricity systems through demand response (DR) programs. A major challenge in residential DR is learning and handling unknown and uncertain customer behaviors. In this letter, we consider the residential DR problem in which the load service entity (LSE) selects a subset of customers to optimize a DR performance metric, such as maximizing the expected load reduction subject to a financial budget or minimizing the expected squared deviation from a target reduction level. To learn the uncertain customer behaviors, which are influenced by various time-varying environmental factors, we formulate residential DR as a contextual multi-armed bandit (MAB) problem and develop an online learning and selection (OLS) algorithm based on Thompson sampling to solve it. The algorithm takes contextual information into account and is applicable to complicated DR settings. Numerical simulations demonstrate the learning effectiveness of the proposed algorithm.
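To make the idea in the abstract concrete, the following is a minimal sketch of Thompson sampling for budgeted customer selection with context. It is not the letter's OLS algorithm: the abstract does not specify the customer behavior model, so this sketch assumes a linear-Gaussian response model with a conjugate posterior, a greedy score-per-cost selection rule, and synthetic data. All names (`LinearThompsonCustomer`, `select_customers`, `simulate`) and all parameter values are hypothetical.

```python
import numpy as np


class LinearThompsonCustomer:
    """Gaussian belief over one customer's response weights (assumed linear model)."""

    def __init__(self, dim, prior_var=1.0, noise_var=0.25):
        self.precision = np.eye(dim) / prior_var  # posterior precision matrix
        self.b = np.zeros(dim)                    # precision-weighted posterior mean
        self.noise_var = noise_var

    def sample_theta(self, rng):
        # Draw one plausible weight vector from the current posterior.
        cov = np.linalg.inv(self.precision)
        cov = (cov + cov.T) / 2.0  # symmetrize against round-off
        return rng.multivariate_normal(cov @ self.b, cov)

    def update(self, context, response):
        # Conjugate Bayesian linear-regression update with known noise variance.
        self.precision += np.outer(context, context) / self.noise_var
        self.b += context * response / self.noise_var


def select_customers(beliefs, contexts, loads, costs, budget, rng):
    """Thompson step: sample each belief, then greedily fill the budget."""
    scores = np.array([
        load * np.clip(belief.sample_theta(rng) @ x, 0.0, 1.0)
        for belief, x, load in zip(beliefs, contexts, loads)
    ])
    chosen, spent = [], 0.0
    for i in np.argsort(-scores / costs):  # best sampled reduction per unit cost first
        if spent + costs[i] <= budget:
            chosen.append(int(i))
            spent += costs[i]
    return chosen


def simulate(rounds=200, n=10, dim=3, budget=4.0, seed=0):
    """Run the loop on synthetic customers (all values here are made up)."""
    rng = np.random.default_rng(seed)
    true_theta = rng.uniform(0.0, 0.4, size=(n, dim))  # hidden response weights
    loads = rng.uniform(1.0, 3.0, size=n)               # reducible load per customer
    costs = np.ones(n)                                  # incentive cost per customer
    beliefs = [LinearThompsonCustomer(dim) for _ in range(n)]
    total_reduction = 0.0
    for _ in range(rounds):
        contexts = rng.uniform(0.0, 1.0, size=(n, dim))  # time-varying factors
        for i in select_customers(beliefs, contexts, loads, costs, budget, rng):
            p = np.clip(true_theta[i] @ contexts[i], 0.0, 1.0)
            responded = float(rng.random() < p)  # binary opt-in outcome
            beliefs[i].update(contexts[i], responded)
            total_reduction += loads[i] * responded
    return beliefs, total_reduction
```

Calling `simulate()` runs 200 DR events over 10 synthetic customers and returns the learned beliefs and the cumulative load reduction. Only selected customers are observed in each round, which is why sampling from the posterior, rather than always using its mean, is what drives exploration here.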

Full Text