Abstract

State aggregation is widely used to handle large-scale Markov decision processes (MDPs). Despite its computational advantage, state aggregation may introduce errors in the estimated value functions of states, which in turn can degrade the objective value. Many cyber-physical energy systems (CPES), including supply-demand matching systems, are discrete-event dynamic systems that can usually be formulated as MDPs. It is therefore of great practical interest to study performance loss bounds for state aggregation in large-scale MDPs. In this paper, we consider the performance loss bound for state aggregation in a class of supply-demand matching systems. These systems have two types of state variables: action-based and action-free. We provide a method for aggregating states that reduces the size of the state space and thus saves memory and computing budget. We make the following contributions. First, we provide performance loss bounds for two sets of naive state aggregations, based on which we propose that the action-free variables should be aggregated first when the true value functions or Q-factors are unknown. Second, we propose a k-means-based method for aggregating states that takes the features of the state variables into account. Third, we consider the battery charging problem for shared electric vehicles (EVs) in a smart grid and test the proposed algorithm on it. The results are consistent with the performance loss bounds and show that the proposed method performs well.
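
As a rough illustration of what k-means-based state aggregation can look like, the sketch below clusters two-dimensional state vectors with plain Lloyd iterations and maps every original state to an aggregate state. It is not the algorithm of the paper. The function names, the farthest-point initialization, the per-variable `weights`, and the toy state space (a battery level treated as the action-based variable and a demand level treated as the action-free variable) are all assumptions made for this example. Giving the action-free variable a small weight is one simple way to make states that differ only in that variable merge first.

```python
def weighted_sq_dist(a, b, weights):
    """Weighted squared Euclidean distance between two state vectors."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))


def aggregate_states(states, k, weights, max_iters=100):
    """Map each state to one of k aggregate states (hypothetical helper).

    Returns (assignment, centroids), where assignment[i] is the index of
    the aggregate state that states[i] belongs to.
    """
    # Deterministic farthest-point initialization (an assumption of this
    # sketch): start from the first state, then repeatedly add the state
    # that is farthest from all centroids chosen so far.
    centroids = [tuple(float(x) for x in states[0])]
    while len(centroids) < k:
        far = max(
            states,
            key=lambda s: min(weighted_sq_dist(s, c, weights) for c in centroids),
        )
        centroids.append(tuple(float(x) for x in far))

    assignment = None
    for _ in range(max_iters):
        # Assignment step: each state joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: weighted_sq_dist(s, centroids[c], weights))
            for s in states
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the centroids are too
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [s for s, a in zip(states, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids


# Toy state space (assumed): battery level in {0, 1, 2}, demand in {0, 1, 2, 3}.
states = [(battery, demand) for battery in range(3) for demand in range(4)]
# A small weight on demand lets states that differ only in demand merge.
assignment, centroids = aggregate_states(states, k=3, weights=(1.0, 0.01))
```

In this toy setting the 12 original states collapse to 3 aggregate states, so a value function or Q-factor table indexed by `assignment` needs a quarter of the entries. How much performance is lost by doing so is the question the bounds in the paper address.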
