Abstract

Heterogeneous architectures have emerged as an effective solution to energy-efficiency challenges. This is particularly true in data centers, where integrating FPGA hardware accelerators with general-purpose processors such as big Xeon or little Atom cores creates substantial opportunities to address the power, scalability, and energy-efficiency challenges of processing emerging applications, particularly in the domain of big data. The rise of hardware accelerators in data centers therefore raises several important research questions: What is the potential for hardware acceleration in MapReduce, a de facto standard for big data analytics? What is the role of the processor after acceleration: is a big or a little core better suited to running big data applications once they are hardware accelerated? This paper answers these questions through methodical real-system experiments on state-of-the-art hardware acceleration platforms. We first present the implementation of four widely used big data applications on a heterogeneous CPU+FPGA architecture. We develop MapReduce implementations of K-means, K-nearest neighbor, support vector machine, and naive Bayes in a Hadoop Streaming environment, which allows mapper functions to be developed in a non-Java language suited to interfacing with an FPGA-based hardware acceleration environment. We present a full implementation of the HW+SW mappers on an existing FPGA+core platform and evaluate how a cluster of CPUs equipped with FPGAs uses the accelerated mapper to enhance the overall performance of MapReduce. Moreover, we study how various parameters at the application, system, and architecture levels affect the performance and power-efficiency benefits of Hadoop Streaming hardware acceleration. This analysis helps to better understand how the presence of HW accelerators for Hadoop MapReduce changes the choice of CPU, the tuning of optimization parameters, and scheduling decisions for performance and energy-efficiency improvement.
The results show promising speedup and energy-efficiency gains of up to 5.7× and 16×, respectively, in an end-to-end Hadoop implementation using a semi-automated HLS framework. Results suggest that HW+SW acceleration yields significantly higher speedup on little cores, reducing the performance gap between little and big cores after acceleration. On the other hand, the energy-efficiency benefit of HW+SW acceleration is higher on the big cores, which reduces the energy-efficiency gap between little and big cores. Overall, the experimental results show that a low-cost embedded FPGA platform, programmed using a semi-automated HW+SW co-design methodology, brings significant performance and energy-efficiency gains for Hadoop MapReduce computing in cloud-based architectures and significantly reduces the reliance on a large number of big high-performance cores.
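
For readers unfamiliar with Hadoop Streaming, the following is a minimal sketch of what a software-only K-means mapper can look like in that model: the mapper reads records from standard input and writes tab-separated key/value pairs to standard output. It is not the paper's implementation. The comma-separated input format, the hard-coded `CENTROIDS`, and the helper names `nearest_centroid`, `map_lines`, and `main` are all assumptions made for illustration, and the abstract does not say which part of each mapper is moved to the FPGA.

```python
import sys

# Hypothetical fixed centroids for illustration only; a real job would load
# the current centroids from a side file shipped to every node.
CENTROIDS = [(0.0, 0.0), (10.0, 10.0)]


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for index, centroid in enumerate(centroids):
        dist = sum((p - c) ** 2 for p, c in zip(point, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def map_lines(lines, centroids=CENTROIDS):
    """Yield one 'centroid_index<TAB>point' record per non-empty input line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumed input format: one point per line, comma-separated floats.
        point = tuple(float(x) for x in line.split(","))
        yield f"{nearest_centroid(point, centroids)}\t{line}"


def main():
    # Hadoop Streaming feeds input splits on stdin and collects stdout.
    for record in map_lines(sys.stdin):
        print(record)
```

A deployed mapper script would call `main()` at its entry point and be passed to the streaming job as the mapper executable. Because the mapper is an ordinary process that communicates through standard streams, it can be written in any language that is convenient for talking to an accelerator.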
