Abstract

Hardware accelerators for machine learning algorithms are of great significance because huge amounts of data are generated continuously and the computational load keeps growing rapidly. Traditional solutions that rely on general-purpose processors, such as x86 CPUs and ARM embedded processors, cannot achieve high computational performance and energy efficiency because they are not adapted to specific algorithms. Domain-specific hardware accelerators are a promising alternative: because the hardware is designed specifically for one type of problem, its computational performance and energy efficiency can exceed those of traditional solutions by more than an order of magnitude. In this paper, we propose a hardware accelerator for support vector machines based on high-level synthesis. With the proposed loop dependence analysis and batch processing method, the speedup reaches about 153x on a Xilinx ZC706 with an unroll factor of 160, which is very close to its limit. This hardware implementation can be used in data centers as a coprocessor to accelerate data processing in many classification applications.
