Abstract
K-means is one of the most commonly used clustering algorithms, with applications in signal processing, artificial intelligence, and image processing, among other fields. Many variations and improvements of K-means exist, of which kernel K-means is the best known. K-means has been the subject of many studies aiming to improve its hardware and software implementations, and several of these have focused on parallelizing it. Kernelization implicitly maps the data into a high-dimensional feature space by computing the inner product between every pair of data points. Kernel K-means therefore involves additional computational steps and has higher computational requirements than standard K-means. As a result, it has received less attention, and much can still be done in terms of parallelized and robust implementations. This original research studies and develops different parallel implementations of kernel K-means on both the CPU and the GPU. The proposed CPU implementations use OpenMP and BLAS, while the GPU implementation uses CUDA on Nvidia GPUs. Several datasets with varying numbers of features and patterns are used. The results show that CUDA generally provides the best run-times, with speedups ranging from two to more than two hundred times over a single-core CPU implementation, depending on the dataset.
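To make the kernelization step concrete, the following is a minimal serial sketch of kernel K-means in plain Python. It is not the implementation studied in this work: the RBF kernel, the `gamma` value, the toy data, and the function names `rbf_kernel_matrix` and `kernel_kmeans` are all assumptions chosen for illustration. The sketch only shows where the pairwise kernel evaluations and the per-point distance computations arise.

```python
import math


def rbf_kernel_matrix(points, gamma=1.0):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) (assumed RBF kernel)."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):  # the matrix is symmetric, so fill both halves at once
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = K[j][i] = math.exp(-gamma * sq_dist)
    return K


def kernel_kmeans(K, initial_labels, n_clusters, max_iter=100):
    """Reassign points using feature-space distances computed from K only."""
    n = len(K)
    labels = list(initial_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(n_clusters)]
        # Third term of the distance: mean kernel value inside each cluster.
        within = [
            sum(K[j][l] for j in m for l in m) / (len(m) ** 2) if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], math.inf
            for c, m in enumerate(members):
                if not m:  # skip empty clusters
                    continue
                cross = sum(K[i][j] for j in m) / len(m)
                # ||phi(x_i) - mu_c||^2 = K_ii - 2 * mean_j K_ij + mean_jl K_jl
                d = K[i][i] - 2.0 * cross + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:  # converged: no assignment changed
            break
        labels = new_labels
    return labels


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
K = rbf_kernel_matrix(points, gamma=0.1)
labels = kernel_kmeans(K, initial_labels=[0, 1, 0, 1, 0, 1], n_clusters=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The Gram matrix needs a kernel evaluation for every pair of points, and each iteration sums kernel entries for every point against every cluster. These independent pairwise loops are the kind of work that lends itself to parallel execution.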