Abstract

Clustering approaches are widely used methodologies to analyse large data sets. The K-means algorithm is well-known as a procedure too computational-intensive for the large data analytic problem. In this work, we focus on a parallel technique to reduce the execution time when the K-means is used to cluster large dataset. We exploit computational powerful of its design when the Graphic Processor Units (GPUs), a massively parallel architecture, is adopted. We optimize the proposed implementation to handle (i) the space limitation issue of GPUs; (ii) the host-device data transfer time. Experimental results, on real and synthetic data, show how our parallelization approach give good results in terms of execution time and speed-up.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.