Abstract

This paper investigates the use of Deep Reinforcement Learning (DRL) to control a profit-seeking storage device trading in the European Continuous Intra-day Electricity Market (CIM). The main objective is to study whether model-free DRL can trade profitably on the CIM. Two DRL agents are compared: Twin Delayed Deep Deterministic Policy Gradients (TD3), and TD3 with behavior cloning. The agents are trained and evaluated in a simulated CIM environment, which uses historical market data to simulate other market participants. A Rolling Intrinsic (RI) algorithm is used as a benchmark. Results indicate that the agents are profitable and occasionally outperform RI, in one instance obtaining 162.03% of RI profit. However, none of the agents consistently outperforms the baseline. These results suggest that DRL has the potential to increase profitability considerably compared to RI, but that the observation provided to the agent does not describe the CIM in enough detail to learn a robust policy. Future research could add further features to the observation, or use model-based DRL to improve performance.
