Machine learning is becoming a cornerstone of smart and autonomous systems. Machine learning algorithms can be broadly categorized into supervised learning (e.g., classification) and unsupervised learning (e.g., clustering). Among the many classification algorithms, the Support Vector Machine (SVM) classifier is one of the most commonly used. By incorporating convex optimization techniques into the SVM classifier, we can further enhance its accuracy and classification process by finding the optimal solution. Many machine learning algorithms, including SVM classification, are compute-intensive and data-intensive, requiring significant processing power. Furthermore, many machine learning algorithms have found their way into portable and embedded devices, which have stringent requirements. In this research work, we introduce a novel and efficient Field Programmable Gate Array (FPGA)-based hardware accelerator for a convex optimization-based SVM classifier for embedded platforms, considering the constraints associated with these platforms and the requirements of the applications running on these devices. We incorporate suitable mathematical kernels and decomposition methods to systematically solve the convex optimization problem for machine learning applications with a large volume of data. Our proposed architectures are generic, parameterized, and scalable; hence, without changes to their internal architectures, our designs can process datasets of varying sizes, can be executed on different platforms, and can be utilized for various machine learning applications. We also introduce system-level architectures and techniques to facilitate real-time processing. Experiments are performed using two different benchmark datasets to evaluate the feasibility and efficiency of our hardware architecture in terms of timing, speedup, area, and accuracy. 
Our embedded hardware design achieves up to 79 times speedup compared to its embedded software counterpart, and can also achieve up to 100% classification accuracy.