Abstract

In Proof-of-Work (PoW) blockchain systems, miners can choose which pool to join in order to maximize their revenue, which gives rise to the mining pool selection problem. A pool can infiltrate other pools with part of its mining power to collect additional revenue without substantially contributing to their mining work. This is called the Block WithHolding (BWH) attack, and it significantly affects miners' pool selection. In this paper, we therefore investigate mining pool selection under the BWH attack. Previous studies rely on the debatable and impractical assumption that miners can observe the attack when calculating their payoffs. This paper instead focuses on unobservable BWH attacks and applies reinforcement learning (RL) techniques to analyze intelligent pool selection by miners. We adopt three typical RL models, namely Q-Learning (QL), Deep Q Network (DQN), and Advantage Actor-Critic (A2C), to dynamically learn the optimal pool selection policy of an intelligent miner, and use a discrete-event simulator to measure the miner's reward. Simulation results demonstrate the learning performance of the three models.
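To make the Q-Learning setting concrete, the following is a minimal sketch of tabular Q-learning for pool selection when the attack is unobservable. It is not the paper's simulator or model: the state (the miner's current pool), the reward model (a nominal per-round reward scaled down by a hidden infiltration fraction, plus Gaussian noise), and all names and parameter values (`train_pool_selector`, `base_reward`, `hidden_infiltration`, `greedy_policy`) are assumptions introduced for illustration only. The learner never reads the infiltration fractions directly; it only sees the sampled rewards.

```python
import random


def train_pool_selector(base_reward, hidden_infiltration, steps=50_000,
                        alpha=0.1, gamma=0.9, epsilon=0.2, noise=0.02, seed=0):
    """Tabular Q-learning over pools; state = current pool, action = next pool.

    Hypothetical reward model (an assumption, not taken from the paper):
    the expected reward of a pool is its nominal reward reduced by the
    fraction of its power that is infiltrated by block-withholding miners.
    """
    rng = random.Random(seed)
    n = len(base_reward)
    q = [[0.0] * n for _ in range(n)]  # q[state][action]
    state = 0
    for _ in range(steps):
        # Epsilon-greedy exploration over the pools.
        if rng.random() < epsilon:
            action = rng.randrange(n)
        else:
            action = max(range(n), key=lambda a: q[state][a])
        # The environment applies the hidden attack; the miner only
        # observes the resulting noisy reward.
        mean = base_reward[action] * (1.0 - hidden_infiltration[action])
        reward = rng.gauss(mean, noise)
        next_state = action  # the miner now sits in the chosen pool
        # Standard Q-learning update toward the one-step TD target.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return the greedy pool choice for each current-pool state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    # Pool 0 has the highest nominal reward but is heavily attacked.
    q_table = train_pool_selector(base_reward=[1.0, 0.9, 0.8],
                                  hidden_infiltration=[0.4, 0.0, 0.0])
    print(greedy_policy(q_table))
```

In this toy setup the pool with the highest nominal reward is the one under attack, so a miner that ranked pools by advertised reward alone would choose poorly, while the learned policy depends only on the rewards actually received. DQN and A2C would replace the table with function approximators; they are not sketched here.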
