With the proliferation of mobile, handheld, and embedded devices, many applications, such as data mining applications, have found their way onto these devices. However, mobile devices face stringent area and power limitations as well as demands for high speed-performance, reduced cost, and short time-to-market. Furthermore, applications running on mobile devices are becoming more complex and require high processing power. These design constraints pose serious challenges to embedded system designers. To process applications on mobile and embedded systems effectively and efficiently, optimized hardware architectures are needed. We are investigating the use of FPGA-based customized hardware to accelerate embedded data mining applications, including handwriting analysis and facial recognition. For these biometric applications, Principal Component Analysis (PCA) is applied first, followed by a similarity measure. In this research work, we introduce novel and efficient embedded hardware architectures to accelerate the PCA computation. PCA is a classic technique that reduces the dimensionality of data by transforming the original data set into a new set of variables, called Principal Components (PCs), that represent the key features of the data. We propose two hardware versions for PCA computation, each with its own optimization techniques to enhance performance, and one with additional techniques to reduce the memory access latency of embedded platforms. To the best of our knowledge, no similar work on PCA catering specifically to embedded devices exists in the published literature. We perform experiments to evaluate the feasibility and efficiency of our designs using a benchmark dataset for biometrics. Our embedded hardware designs are generic, parameterized, and scalable, and they achieve a 78-fold speedup over their software counterparts.
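As a rough software illustration of the PCA step described above (not the FPGA architecture proposed in this work), the following Python sketch mean-centers a small data set, forms its covariance matrix, and extracts the first principal component. It assumes power iteration as the eigenvector method, which the abstract does not specify; the function names and the toy data are hypothetical and chosen only for illustration.

```python
import math


def mean_center(data):
    """Subtract the per-feature mean from every sample."""
    n = len(data)
    dim = len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(dim)]
    centered = [[row[j] - means[j] for j in range(dim)] for row in data]
    return centered, means


def covariance(centered):
    """Sample covariance matrix (divides by n - 1) of mean-centered data."""
    n = len(centered)
    dim = len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def top_component(cov, iters=200):
    """Approximate the dominant eigenvector and eigenvalue by power iteration.

    Assumption: the starting vector is not orthogonal to the dominant
    eigenvector; a production version would also test for convergence.
    """
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero covariance: no direction of variance to find
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for unit-length v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)
    )
    return v, eigenvalue


def project(centered, component):
    """Project each centered sample onto one principal component."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]


# Toy data lying exactly on the line y = 2x (hypothetical example).
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered, means = mean_center(samples)
cov = covariance(centered)
pc1, variance = top_component(cov)
scores = project(centered, pc1)
```

Here `scores` holds the one-dimensional representation of each two-dimensional sample. In a biometric pipeline of the kind described, a similarity measure would then be computed on such reduced vectors rather than on the raw data.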