Abstract

Incentive-based demand response plays an increasingly important role in ensuring the safe operation of the power grid and reducing system costs, and advances in information and communications technology have made it possible to implement it online. However, in regions where incentive-based demand response has never been implemented, customers’ response behavior is unknown, and setting the incentive price quickly and accurately is a challenge for service providers. This paper proposes a pricing method that combines long short-term memory (LSTM) networks and reinforcement learning to solve the service provider’s pricing problem when customers’ response behavior is unknown. Taking the total profit over all response time slots in one day as the optimization objective, LSTM networks are used to learn the relationship between customers’ response behavior and the incentive price, and reinforcement learning is used to explore and determine the optimal price. The results show that combining the two methods enables virtual exploration of the optimal price, which overcomes the drawback that reinforcement learning alone can only explore through delayed rewards in the real environment, and thereby speeds up the process of setting the optimal price. In addition, because the method accounts for how the combination of incentive prices across time slots affects the service provider’s profit, it avoids the negative effects of myopic optimization.
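
The idea of virtual exploration can be illustrated with a minimal sketch. This is not the paper's implementation: the trained LSTM is replaced by a hand-written stand-in function (`surrogate_response`), reinforcement learning is reduced to tabular Q-learning, and every number (the candidate prices, the per-slot values, the fatigue effect that couples adjacent slots) is a hypothetical assumption chosen only to make the example run. What it shows is that the learner can query the response model instead of real customers, and that it optimizes the profit of the whole day rather than of each slot separately.

```python
import random
from itertools import product

# All numbers below are hypothetical, chosen only for illustration.
PRICES = [0.0, 0.1, 0.2, 0.3, 0.4]  # candidate incentive prices per kWh
VALUE = [0.5, 0.8, 0.6]             # provider's value of 1 kWh reduced, per slot


def surrogate_response(slot, price, prev_price):
    """Stand-in for the trained LSTM: predicted demand reduction in kWh.

    Assumed form: response grows with the price and shrinks when the
    previous slot was heavily incentivised (a simple fatigue effect).
    """
    fatigue = 1.0 - 0.5 * prev_price / max(PRICES)
    return 10.0 * price * fatigue


def slot_profit(slot, price, prev_price):
    """Profit of one slot: margin per kWh times predicted reduction."""
    return (VALUE[slot] - price) * surrogate_response(slot, price, prev_price)


def day_profit(prices):
    """Total profit of one day for a full price schedule."""
    total, prev = 0.0, 0.0
    for slot, price in enumerate(prices):
        total += slot_profit(slot, price, prev)
        prev = price
    return total


def q_learning(episodes=20000, alpha=0.5, epsilon=0.3, seed=0):
    """Tabular Q-learning; rewards come from the surrogate (virtual exploration)."""
    rng = random.Random(seed)
    n_slots, n_actions = len(VALUE), len(PRICES)
    # State = (slot, index of the previous slot's price).
    q = {(t, a): [0.0] * n_actions
         for t in range(n_slots) for a in range(n_actions)}
    for _ in range(episodes):
        prev = 0  # the day starts with no incentive in the preceding slot
        for t in range(n_slots):
            state = (t, prev)
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)  # explore
            else:
                action = max(range(n_actions), key=lambda i: q[state][i])
            reward = slot_profit(t, PRICES[action], PRICES[prev])
            # Undiscounted target: the goal is the whole day's profit.
            future = max(q[(t + 1, action)]) if t + 1 < n_slots else 0.0
            q[state][action] += alpha * (reward + future - q[state][action])
            prev = action
    return q


def greedy_schedule(q):
    """Read the learned price schedule out of the Q-table."""
    schedule, prev = [], 0
    for t in range(len(VALUE)):
        action = max(range(len(PRICES)), key=lambda i: q[(t, prev)][i])
        schedule.append(PRICES[action])
        prev = action
    return schedule


if __name__ == "__main__":
    learned = greedy_schedule(q_learning())
    best = max(product(PRICES, repeat=len(VALUE)), key=day_profit)
    print("learned schedule:", learned, "profit:", day_profit(learned))
    print("enumerated optimum:", list(best), "profit:", day_profit(best))
```

Because the price in one slot changes the response in the next, the state includes the previous price and the update target adds the best achievable future profit; a learner that maximized each slot's profit in isolation would ignore that coupling. The brute-force enumeration in the usage block is feasible only because this toy problem is tiny, and serves as a check on the learned schedule.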

