Abstract

The game of Go has long been regarded as the most challenging game for artificial intelligence because of its enormous search space and the difficulty of evaluating board positions. In early 2016, the defeat of Lee Sedol by AlphaGo became a milestone for artificial intelligence. AlphaGo's success lies in efficiently combining policy and value networks with Monte Carlo tree search (MCTS). These deep convolutional neural networks (DCNNs) are trained by a combination of supervised learning and reinforcement learning. However, large convolution operations are computationally intensive and typically require a powerful computing platform, for example, a graphics processing unit (GPU). It is therefore challenging to deploy DCNNs in resource-limited embedded systems. Field-programmable gate arrays (FPGAs) have been proposed as an appropriate solution for implementing real-time DCNN models; however, limited bandwidth and on-chip memory storage are the bottlenecks for DCNN acceleration. In this article, an AlphaGo Policy Network is designed, and efficient hardware architectures are proposed to accelerate the DCNN model. The accelerator can fit into different FPGAs, offering a balance between processing speed and hardware resources. As an example, the AlphaGo Policy Network is implemented on the Xilinx VCU118 evaluation board, and the results show that our implementation achieves a performance of 3036.32 GOPS, with up to a 56x speedup over a CPU and a 22.4x speedup over a GPU.

Highlights

  • The game of Go is a two-player board game tracing its origins to China more than 2,500 years ago

  • Unlike graphics processing unit (GPU) or other field-programmable gate array (FPGA) implementations that transform convolution into matrix multiplication, this article presents a pipelined Policy Network deep convolutional neural network (DCNN) model as adopted by AlphaGo


Summary

INTRODUCTION

The game of Go (weiqi in Mandarin pinyin) is a two-player board game tracing its origins to China more than 2,500 years ago. Unlike GPU or other FPGA designs that transform convolution into matrix multiplication, the proposed DCNN accelerator follows the original definition of the convolutional neural network, and a pipelined architecture computes 8 pixels of the output feature map in parallel. For the two-dimensional input matrix, a block-fetch module caches the convolution feature map and converts the data into multiple data blocks; the data stream is fed in by rows, from left to right and from top to bottom. The final layer applies no ReLU function: its input is the output of the 13th convolutional layer, and 32-bit floating-point data is used directly in the softmax layer.
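The dataflow described above, with the input feature map streamed row by row and 8 output pixels produced per step, can be modeled in software as a reference. The sketch below is a minimal single-channel model; the "same" zero padding, kernel size, and single-channel simplification are assumptions for illustration and are not stated in the excerpt.

```python
import numpy as np

def conv2d_8wide(fmap, kernel):
    """Software model of the pipelined convolution: each output row is
    produced in groups of 8 horizontally adjacent pixels, mimicking the
    8-pixel output parallelism of the hardware. Computes the
    cross-correlation form of convolution, as is typical for DCNNs.

    fmap:   (H, W) input feature map (single channel, assumed).
    kernel: (K, K) filter with odd K; 'same' zero padding assumed.
    """
    H, W = fmap.shape
    K = kernel.shape[0]
    pad = K // 2
    padded = np.pad(fmap, pad)  # zero-pad all four borders
    out = np.zeros((H, W), dtype=fmap.dtype)
    # Stream rows top to bottom, columns left to right.
    for r in range(H):
        for c0 in range(0, W, 8):
            # In hardware this inner group is one parallel step;
            # here it is modeled sequentially.
            for c in range(c0, min(c0 + 8, W)):
                window = padded[r:r + K, c:c + K]
                out[r, c] = np.sum(window * kernel)
    return out
```

For a 19x19 Go board feature map and a 3x3 kernel, each of the 19 output rows takes three groups of 8 (the last group partially filled), matching the row-major, left-to-right streaming order described above.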

SOFTMAX MODULE
RESULTS
CONCLUSION
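The SOFTMAX MODULE section is not reproduced in this excerpt, but the summary notes that the softmax layer consumes the 13th convolutional layer's output directly in 32-bit floating point. A minimal float32 softmax sketch follows; treating the logits as the 361 move scores of a 19x19 board is an assumption based on the Go setting, not stated here.

```python
import numpy as np

def softmax_fp32(logits):
    """Numerically stable softmax kept entirely in 32-bit floating point,
    matching the summary's note that the softmax layer uses float32 data
    directly (no fixed-point quantization in this layer)."""
    x = np.asarray(logits, dtype=np.float32)
    x = x - x.max()                 # subtract max to avoid overflow in exp
    e = np.exp(x)
    return e / e.sum(dtype=np.float32)
```

The max-subtraction step costs one extra pass over the logits but keeps `exp` within float32 range, which matters when the preceding convolutional layer's outputs are unbounded.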