Abstract
Field-programmable gate arrays (FPGAs) have become prevalent in recent years as an alternative to application-specific integrated circuits (ASICs) and as a potentially cheaper alternative to graphics processing units (GPUs). Introduced as a prototyping solution for ASICs, FPGAs are now popular in applications that must process data rapidly, such as artificial intelligence (AI) and machine learning (ML) models. As a relatively low-cost alternative to GPUs, FPGAs have the advantage that they can be reprogrammed for use in almost any data-driven application. In this work, we propose an easily scalable and cost-effective cluster-based co-processing system using FPGAs for ML and AI applications that is easily reconfigured to the requirements of each user application. The aim is to introduce a clustering system of FPGA boards that improves the efficiency of the training component of machine learning algorithms. Our proposed configuration makes it possible to build a cluster from relatively inexpensive FPGA development boards without expert knowledge of VHDL, Verilog, or the system designs related to FPGA development. The system consists of two parts, a computer-based host application that controls the cluster and an FPGA cluster connected through a high-speed Ethernet switch, and it allows users to customise and adapt the system with little effort. The methods proposed in this paper allow any FPGA board with an Ethernet port to be used as part of the cluster, and allow the cluster to be scaled without bound. To demonstrate the effectiveness, flexibility, and portability of the proposed work, a two-part experiment was conducted, one part on a homogeneous cluster and one on a heterogeneous cluster, with results compared against a desktop computer and combinations of FPGAs in two clusters. Data sets ranging from 60,000 to 14 million, including stroke prediction and COVID-19 data sets, were used in conducting the experiments.
The results suggest that the proposed system performs close to 70% faster than a traditional computer while achieving similar accuracy rates.
More From: International Journal of Parallel, Emergent and Distributed Systems