The recursive least mean p-power extreme learning machine (RLMP-ELM) is a recently proposed online machine learning algorithm that provides robust online prediction on datasets corrupted by noise with different statistics. To explore the use of RLMP-ELM in real-world embedded systems, this paper presents a generic serial FPGA-based hardware architecture for the algorithm. The architecture comprises three serial processing modules, each of which is parameterized so that it can be adapted to different application requirements. Although the overall framework is serial, the processes with high computational complexity are parallelized, guided by an analysis of potential inter-task dependencies. To overcome the memory bandwidth limitation, block RAM and a ping-pong on-chip buffer are used to improve computational throughput. Validation experiments are performed on five datasets with different p values. The accuracy results show that our FPGA implementation achieves accuracy similar to that of a 64-bit floating-point software implementation. We also report the hardware performance of the proposed architecture and compare it with other existing implementations. The results show that our hardware architecture offers an excellent balance among accuracy, logic occupation, and hardware performance.