The diversity in load profile formation and utility preferences among multitype electricity users poses a challenge for real-time pricing (RTP) and welfare equilibrium. This paper designs an RTP strategy for smart grids. On the demand side, it constructs utility functions that reflect user characteristics and uses multiple agents to represent different user interests. For industrial users, the model includes small-scale microgrids, distributed generation, and battery energy storage systems. Based on the interests of both the supply and demand sides, a distributed online multi-agent reinforcement learning (RL) algorithm is proposed. A bi-level stochastic model formulated within the Markov decision process framework is used to optimize the RTP strategy. Through information exchange, an adaptive pricing scheme balances these interests and arrives at optimal strategies. Simulation results confirm the effectiveness of the proposed method and algorithm in peak shaving and valley filling. A comparison of three load fluctuation scenarios shows the algorithm's adaptability. The findings indicate the potential of the RL-based bi-level pricing model for resource allocation and user benefits in smart grids. The innovations in user modeling, model construction, and algorithm application have theoretical and practical significance for electricity market research.