Hadoop cluster with FPGA-based hardware accelerators for K-means clustering algorithm

Ching-Che Chung, Yu-Hsin Wang

doi:10.1109/icce-china.2017.7991036

Abstract

In this paper, the implementation of the K-means clustering algorithm on a Hadoop cluster with FPGA-based hardware accelerators is presented. The proposed design follows the MapReduce programming model and uses the Hadoop Distributed File System (HDFS) to store large datasets. The proposed FPGA-based hardware accelerator for speeding up the K-means clustering algorithm is implemented on Xilinx VC707 evaluation boards (EVBs). The proposed Hadoop cluster consists of four computers: one Master Node and three Slave Nodes. The Slave Nodes communicate with the VC707 EVBs through Gigabit Ethernet. The experimental results show that, for clustering an input dataset of 125 million three-dimensional points, the proposed design achieves a 4× speedup over the same Hadoop cluster without FPGA-based hardware accelerators.
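For readers unfamiliar with how K-means maps onto MapReduce, the following is a minimal Python sketch of a single iteration: a map step assigns each point to its nearest centroid, and a reduce step averages the points of each cluster to produce updated centroids. It is an illustration under stated assumptions, not the authors' implementation. The function names (`nearest_centroid`, `map_phase`, `reduce_phase`, `kmeans_iteration`) are hypothetical, the code runs in a single process rather than on Hadoop, and it makes no assumption about which part of the computation the FPGA accelerators handle, since the abstract does not say.

```python
from collections import defaultdict


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point`,
    using squared Euclidean distance."""
    best_index, best_dist = 0, float("inf")
    for i, centroid in enumerate(centroids):
        dist = sum((p - c) ** 2 for p, c in zip(point, centroid))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def map_phase(points, centroids):
    """Map step: emit one (cluster index, point) pair per input point."""
    for point in points:
        yield nearest_centroid(point, centroids), point


def reduce_phase(pairs, centroids):
    """Reduce step: average the points in each cluster.
    A cluster that received no points keeps its old centroid
    (a simplifying choice made for this sketch)."""
    dims = len(centroids[0])
    sums = defaultdict(lambda: [0.0] * dims)
    counts = defaultdict(int)
    for index, point in pairs:
        counts[index] += 1
        for d in range(dims):
            sums[index][d] += point[d]
    new_centroids = []
    for i, old in enumerate(centroids):
        if counts[i] == 0:
            new_centroids.append(tuple(float(x) for x in old))
        else:
            new_centroids.append(tuple(s / counts[i] for s in sums[i]))
    return new_centroids


def kmeans_iteration(points, centroids):
    """Run one map + reduce pass and return the updated centroids."""
    return reduce_phase(map_phase(points, centroids), centroids)


if __name__ == "__main__":
    # Toy three-dimensional data, chosen only for illustration.
    points = [(0, 0, 0), (2, 0, 0), (10, 10, 10), (12, 10, 10)]
    centroids = [(1, 1, 1), (9, 9, 9)]
    print(kmeans_iteration(points, centroids))
```

In a real MapReduce job the pairs emitted by the map step would be grouped by cluster index and shuffled to reducers across nodes, and the iteration would be repeated until the centroids stop changing.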
