Abstract

This study proposes a framework for automatic high-frequency trading based on a deep reinforcement learning model, proximal policy optimization (PPO). The framework casts the trading process as actions and the resulting returns as rewards, in line with the reinforcement learning paradigm. It compares price prediction models, including long short-term memory (LSTM) and the multi-layer perceptron (MLP), by applying them to real-time bitcoin prices. An automatic trading strategy is then constructed on top of PPO, with the LSTM serving as the basis of the policy. The approach trades bitcoin in a simulated environment with synchronized real-time data and achieves a return []% higher than the market average. The results demonstrate that the approach can earn excess returns and can be extended to other financial products.
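
To make the described setup concrete, the sketch below illustrates the core mechanics named in the abstract: trades are the agent's actions, realized returns are the rewards, and an LSTM feeds the PPO policy. It is not the authors' implementation; the environment, network sizes, hyperparameters, and the synthetic random-walk price series are all illustrative assumptions standing in for the paper's real-time bitcoin feed and actual configuration.

```python
# Minimal sketch (assumed, not the authors' code): a toy trading environment where
# actions are {sell, hold, buy}, rewards are realized returns, and an LSTM policy
# is trained with PPO's clipped surrogate objective.
import numpy as np
import torch
import torch.nn as nn

class BitcoinTradingEnv:
    """Toy environment: observe a normalized price window, act in {sell, hold, buy}."""
    def __init__(self, prices, window=16):
        self.prices, self.window = prices, window

    def reset(self):
        self.t, self.position = self.window, 0.0
        return self._obs()

    def _obs(self):
        w = self.prices[self.t - self.window:self.t]
        return (w / w[-1] - 1.0).astype(np.float32)    # window relative to latest price

    def step(self, action):                            # action: 0=sell, 1=hold, 2=buy
        self.position = float(action - 1)              # exposure of -1, 0, or +1
        ret = self.prices[self.t + 1] / self.prices[self.t] - 1.0
        reward = self.position * ret                   # reward = realized return
        self.t += 1
        done = self.t + 1 >= len(self.prices)
        return self._obs(), reward, done

class LSTMPolicy(nn.Module):
    """LSTM over the price window, with actor (policy) and critic (value) heads."""
    def __init__(self, hidden=32, n_actions=3):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.pi = nn.Linear(hidden, n_actions)
        self.v = nn.Linear(hidden, 1)

    def forward(self, obs):                            # obs: (batch, window)
        h, _ = self.lstm(obs.unsqueeze(-1))
        h = h[:, -1]                                   # last hidden state
        return torch.distributions.Categorical(logits=self.pi(h)), self.v(h).squeeze(-1)

def ppo_update(policy, opt, obs, actions, old_logp, returns, clip=0.2, epochs=4):
    """One PPO update on a collected rollout, using the clipped surrogate loss."""
    for _ in range(epochs):
        dist, value = policy(obs)
        ratio = torch.exp(dist.log_prob(actions) - old_logp)
        adv = returns - value.detach()
        loss = (-torch.min(ratio * adv, torch.clamp(ratio, 1 - clip, 1 + clip) * adv).mean()
                + 0.5 * (returns - value).pow(2).mean())
        opt.zero_grad(); loss.backward(); opt.step()

# Rollout collection and training loop on a synthetic random-walk price series.
prices = np.cumprod(1 + np.random.normal(0, 0.002, 2000)) * 30000
env, policy = BitcoinTradingEnv(prices), LSTMPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
for episode in range(5):
    obs_buf, act_buf, logp_buf, rew_buf = [], [], [], []
    obs, done = env.reset(), False
    while not done:
        o = torch.as_tensor(obs).unsqueeze(0)
        with torch.no_grad():
            dist, _ = policy(o)
        a = dist.sample()
        obs_buf.append(o); act_buf.append(a); logp_buf.append(dist.log_prob(a))
        obs, r, done = env.step(a.item())
        rew_buf.append(r)
    # Discounted returns serve as a simple target for the advantage estimate.
    rets, g = [], 0.0
    for r in reversed(rew_buf):
        g = r + 0.99 * g
        rets.insert(0, g)
    ppo_update(policy, opt,
               torch.cat(obs_buf), torch.cat(act_buf), torch.cat(logp_buf),
               torch.as_tensor(rets, dtype=torch.float32))
```

In this simplified reading of the framework, the reward at each step is the position multiplied by the next price change, so maximizing cumulative reward corresponds to maximizing trading return; a faithful reproduction would add transaction costs, the paper's actual state features, and its PPO hyperparameters.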
