Abstract
Big data and machine learning applications pose steadily increasing challenges to compute platforms in terms of performance and energy efficiency. In this paper, we use the highly scalable heterogeneous server platform RECS to evaluate a wide variety of hardware platforms, ranging from general-purpose CPUs and ARM-based SoCs to GPGPUs and FPGAs. The self-organizing map, a popular neural network model for unsupervised clustering and dimensionality reduction, serves as a representative example of machine learning applications in the big data domain. Optimized implementations of the algorithm have been developed for each of the target architectures. An in-depth analysis of the achieved performance and energy efficiency across a wide range of application parameters shows that no single architecture is the most energy-efficient over the complete design space. In our study, ARM-based SoCs achieve the highest energy efficiency for small network sizes, while FPGAs and GPGPUs perform best for large data sets. Compared to an implementation based on the Matlab SOM toolbox, our optimized multi-threaded CPU implementation achieves two orders of magnitude higher performance and energy efficiency. Large simulations benefit especially from the FPGA implementation, which outperforms the optimized CPU implementation by a factor of 220 and provides 28 times higher energy efficiency.