Abstract

<p>Data mining combines techniques such as classification and clustering to extract useful information from a dataset. Clustering is one of the most widely used data mining techniques today. K-Means and K-Medoids are among the most commonly used clustering algorithms because they are easy to implement, efficient, and give good results. Besides extracting important information, the time spent mining data is also a concern today, since real-world applications produce huge volumes of data. This research analyzes the clustering results and time performance of the K-Means and K-Medoids algorithms, using a High Performance Computing (HPC) cluster and the Message Passing Interface (MPI) library to parallelize both algorithms. The results show that K-Means gives a smaller sum of squared errors (SSE) than K-Medoids, and that the parallel algorithms using MPI run faster than their sequential counterparts.</p>

Highlights

  • Data is now generated at a massive and rapidly growing scale

  • Related work includes the study by Jing Zhang, Gongqing Wu, Xuegang Hu, Shiying Li, and Shuilang Hao titled “A Parallel K-means Clustering Algorithm with Message Passing Interface (MPI)” [1]

  • We propose a High Performance Computing (HPC) cluster approach for implementing K-Means and K-Medoids on a parallel platform

Summary

INTRODUCTION

Data is now generated at a massive and rapidly growing scale, and it can be collected anywhere and at any time. Data mining techniques make it possible to gather this information and process it into knowledge. Clustering is one such technique, and it is very useful for real-world problems [9]. A clustering algorithm can be selected based on the data type or on how the data will be used. The problem is how to process data with thousands of dimensions with high accuracy and in the shortest possible time. One example of work on this problem is the study by Jing Zhang, Gongqing Wu, Xuegang Hu, Shiying Li, and Shuilang Hao titled “A Parallel K-means Clustering Algorithm with MPI” [1]. In this research, parallel data clustering using the Message Passing Interface (MPI) was carried out to obtain high accuracy and low computation time for clustering in the data mining process.
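The excerpt does not show how the authors parallelize K-Means, so the following is only a minimal sketch of the common data-parallel pattern, written in plain Python without any MPI calls. It assumes the dataset is split into chunks (one per worker), each worker computes per-cluster sums, counts, and its share of the SSE, and the partial results are then added together, which is the role a sum all-reduce would play in an MPI program. The function names `partial_sums` and `kmeans_step` are hypothetical and chosen for illustration.

```python
from math import dist


def partial_sums(chunk, centroids):
    """Work done by one worker on its own chunk of the data.

    Returns per-cluster coordinate sums, per-cluster point counts, and the
    chunk's SSE measured against the centroids used for the assignment.
    """
    k, d = len(centroids), len(centroids[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    sse = 0.0
    for p in chunk:
        # Assign the point to its nearest centroid (Euclidean distance).
        j = min(range(k), key=lambda c: dist(p, centroids[c]))
        counts[j] += 1
        for a in range(d):
            sums[j][a] += p[a]
        sse += dist(p, centroids[j]) ** 2
    return sums, counts, sse


def kmeans_step(chunks, centroids):
    """One K-Means iteration over all chunks.

    The loop below runs sequentially; it stands in for the reduction step
    (assumption: a sum all-reduce) that would combine results across ranks.
    """
    k, d = len(centroids), len(centroids[0])
    total = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    sse = 0.0
    for chunk in chunks:
        s, c, e = partial_sums(chunk, centroids)
        for j in range(k):
            counts[j] += c[j]
            for a in range(d):
                total[j][a] += s[j][a]
        sse += e
    # New centroid = mean of assigned points; keep the old one if a cluster is empty.
    new_centroids = [
        [total[j][a] / counts[j] for a in range(d)] if counts[j] else list(centroids[j])
        for j in range(k)
    ]
    return new_centroids, sse


if __name__ == "__main__":
    # Four 2-D points split across two hypothetical workers.
    chunks = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]
    centroids = [[0.0, 0.0], [10.0, 0.0]]
    for _ in range(2):
        centroids, sse = kmeans_step(chunks, centroids)
        print(centroids, sse)
```

Because only the sums, counts, and SSE cross worker boundaries, the amount of data exchanged per iteration depends on the number of clusters and dimensions rather than on the number of points, which is one reason this pattern is a natural fit for message passing.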

K-MEANS CLUSTERING
K-MEDOIDS CLUSTERING
PARALLEL K-MEANS AND K-MEDOIDS CLUSTERING
CLUSTER EVALUATION
PARALLEL PERFORMANCE EVALUATION
DATASET
RESEARCH METHOD
Pre-processing Data
CLUSTERING PERFORMANCE
TIME EVALUATION OF SEQUENTIAL COMPUTATION
Conclusion