Abstract

The game of Go has long been regarded as the most challenging game for artificial intelligence because of its enormous search space and the difficulty of evaluating board positions. In early 2016, the defeat of Lee Sedol by AlphaGo became a milestone for artificial intelligence. AlphaGo's success lies in efficiently combining policy and value networks with Monte Carlo tree search (MCTS). These deep convolutional neural networks (DCNNs) are trained by a combination of supervised learning and reinforcement learning. However, large convolution operations are computationally intensive and typically require a powerful computing platform, for example, a graphics processing unit (GPU). It is therefore challenging to deploy DCNNs in resource-limited embedded systems. Field-programmable gate arrays (FPGAs) have been proposed as an appropriate solution for implementing real-time DCNN models; however, limited bandwidth and on-chip memory storage are the bottlenecks for DCNN acceleration. In this article, an AlphaGo Policy Network is designed, and efficient hardware architectures are proposed to accelerate the DCNN model. The accelerator can fit into different FPGAs, offering a balance between processing speed and hardware resources. As an example, the AlphaGo Policy Network is implemented on the Xilinx VCU118 evaluation board, and the results show that our implementation achieves a performance of 3036.32 GOPS, with up to a 56x speedup over a CPU and a 22.4x speedup over a GPU.

Highlights

  • The game of Go is a two-player board game tracing its origins to China more than 2,500 years ago

  • Unlike graphics processing unit (GPU) or other field-programmable gate array (FPGA) implementations that transform convolution into matrix multiplication, this article presents a pipelined Policy Network deep convolutional neural network (DCNN) model as adopted by AlphaGo


Summary

INTRODUCTION

The game of Go (weiqi in Mandarin pinyin) is a two-player board game tracing its origins to China more than 2,500 years ago. Unlike GPU or other FPGA designs that transform convolution into matrix multiplication, the proposed DCNN accelerator follows the original definition of the convolutional neural network, and a pipelined architecture computes 8 pixels of the output feature map in parallel. For the two-dimensional input matrix, a block-fetch module caches the convolution feature map and converts the data into multiple data blocks; the data stream is fed in by rows, from left to right and from top to bottom. The final layer applies no ReLU function: its input is the output of the 13th convolutional layer, and 32-bit floating-point data is used directly in the softmax layer.
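The dataflow described above, with the input feature map streamed row by row and 8 output pixels produced per step, can be modeled in software as a reference. The sketch below is a minimal single-channel model; the "same" zero padding, kernel size, and single-channel simplification are assumptions for illustration and are not stated in the excerpt.

```python
import numpy as np

def conv2d_8wide(fmap, kernel):
    """Software model of the pipelined convolution: each output row is
    produced in groups of 8 horizontally adjacent pixels, mimicking the
    8-pixel output parallelism of the hardware. Computes the
    cross-correlation form of convolution, as is typical for DCNNs.

    fmap:   (H, W) input feature map (single channel, assumed).
    kernel: (K, K) filter with odd K; 'same' zero padding assumed.
    """
    H, W = fmap.shape
    K = kernel.shape[0]
    pad = K // 2
    padded = np.pad(fmap, pad)  # zero-pad all four borders
    out = np.zeros((H, W), dtype=fmap.dtype)
    # Stream rows top to bottom, columns left to right.
    for r in range(H):
        for c0 in range(0, W, 8):
            # In hardware this inner group is one parallel step;
            # here it is modeled sequentially.
            for c in range(c0, min(c0 + 8, W)):
                window = padded[r:r + K, c:c + K]
                out[r, c] = np.sum(window * kernel)
    return out
```

For a 19x19 Go board feature map and a 3x3 kernel, each of the 19 output rows takes three groups of 8 (the last group partially filled), matching the row-major, left-to-right streaming order described above.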

SOFTMAX MODULE
RESULTS
CONCLUSION
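The SOFTMAX MODULE section is not reproduced in this excerpt, but the summary notes that the softmax layer consumes the 13th convolutional layer's output directly in 32-bit floating point. A minimal float32 softmax sketch follows; treating the logits as the 361 move scores of a 19x19 board is an assumption based on the Go setting, not stated here.

```python
import numpy as np

def softmax_fp32(logits):
    """Numerically stable softmax kept entirely in 32-bit floating point,
    matching the summary's note that the softmax layer uses float32 data
    directly (no fixed-point quantization in this layer)."""
    x = np.asarray(logits, dtype=np.float32)
    x = x - x.max()                 # subtract max to avoid overflow in exp
    e = np.exp(x)
    return e / e.sum(dtype=np.float32)
```

The max-subtraction step costs one extra pass over the logits but keeps `exp` within float32 range, which matters when the preceding convolutional layer's outputs are unbounded.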