Abstract

In the context of today's artificial intelligence, data volumes are exploding. Scaling distributed clusters horizontally is a feasible way to meet the growing demand for computing power in massive data processing, but adding nodes without limit leads to bloated cluster sizes. Moreover, most of the transistors in a CPU are devoted to cache memory and control units, which makes the CPU inefficient for the computational operations that dominate massive data processing. Currently, academia uses hardware devices such as the GPU (Graphics Processing Unit), ASIC (Application Specific Integrated Circuit), and FPGA (Field Programmable Gate Array) to accelerate workloads with massive computational demands, such as deep learning and image processing. This paper first discusses the advantages and technical requirements of FPGA acceleration based on the characteristics of the Spark cluster. It then proposes the design of an FPGA-CPU heterogeneous acceleration platform and introduces the radix-2 FFT algorithm. Finally, the paper presents and compares the computation time of the radix-2 FFT algorithm before and after acceleration. The results show that the heterogeneous cluster achieves a speedup of about 1.79x over the CPU-only cluster.
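For readers unfamiliar with the benchmark kernel, the sketch below shows a minimal recursive radix-2 (decimation-in-time) FFT in Scala, the language of the Spark ecosystem the abstract references. It is an illustrative software reference only, not the authors' FPGA implementation or Spark integration; the object and class names (Radix2FFT, Complex) are assumptions for this example.

```scala
// Minimal radix-2 decimation-in-time FFT sketch (illustrative only).
// The paper's FPGA kernel and Spark-side offloading are not shown here.
object Radix2FFT {
  case class Complex(re: Double, im: Double) {
    def +(o: Complex) = Complex(re + o.re, im + o.im)
    def -(o: Complex) = Complex(re - o.re, im - o.im)
    def *(o: Complex) = Complex(re * o.re - im * o.im, re * o.im + im * o.re)
  }

  // Input length must be a power of two.
  def fft(x: Vector[Complex]): Vector[Complex] = {
    val n = x.length
    if (n == 1) return x
    // Split into even- and odd-indexed samples and transform each half.
    val even = fft(x.indices.collect { case i if i % 2 == 0 => x(i) }.toVector)
    val odd  = fft(x.indices.collect { case i if i % 2 == 1 => x(i) }.toVector)
    // Combine with twiddle factors W_n^k = exp(-2*pi*i*k/n).
    val combined = (0 until n / 2).map { k =>
      val angle   = -2.0 * math.Pi * k / n
      val twiddle = Complex(math.cos(angle), math.sin(angle)) * odd(k)
      (even(k) + twiddle, even(k) - twiddle)
    }
    combined.map(_._1).toVector ++ combined.map(_._2).toVector
  }

  def main(args: Array[String]): Unit = {
    val signal = Vector(1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0).map(Complex(_, 0.0))
    fft(signal).foreach(c => println(f"${c.re}%.3f + ${c.im}%.3fi"))
  }
}
```

The butterfly structure in the combine step is what maps naturally onto FPGA pipelines, which is presumably why the radix-2 FFT was chosen as the acceleration benchmark; the measured 1.79x speedup refers to the authors' heterogeneous cluster, not to this sketch.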
