Verification and validation of MapReduce program model for parallel K-means algorithm on Hadoop cluster

Amresh Kumar, B R Prathap, M Kiran

doi:10.1109/icccnt.2013.6726852

Abstract

With the development of information technology, a large and growing volume of data is being stored electronically. The data volumes processed by many applications will routinely cross the petabyte threshold, which in turn increases the computational requirements. Efficient processing algorithms and implementation techniques are key to meeting the scalability and performance requirements of such scientific data analyses. To this end, we have analyzed various MapReduce programs and a parallel clustering algorithm (PKMeans) on a Hadoop cluster, using the concept of MapReduce. In this experiment we have verified and validated several MapReduce applications: wordcount, grep, terasort, and the parallel K-Means clustering algorithm. We found that execution time decreases as the number of nodes increases, but we also found some interesting cases during the experiment; we recorded the resulting performance changes and drew performance graphs. This experiment is essentially a research study of the above MapReduce applications, and it also verifies and validates the MapReduce program model for the parallel K-Means algorithm on a four-node Hadoop cluster.
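The abstract does not show how K-means is expressed in the MapReduce model, so the following is only a minimal sketch of the usual formulation, simulated in plain Python rather than run on Hadoop. The function names (`nearest`, `map_phase`, `reduce_phase`, `kmeans`) and the stopping rule are illustrative assumptions, not the authors' implementation: the map step assigns each point to its nearest centroid, and the reduce step averages the points that share a key to produce the new centroids.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Point = Tuple[float, ...]


def nearest(point: Point, centroids: List[Point]) -> int:
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points: Iterable[Point], centroids: List[Point]) -> Iterator[Tuple[int, Tuple[Point, int]]]:
    """Map: emit (centroid index, (point, 1)) for every input point."""
    for point in points:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs: Iterable[Tuple[int, Tuple[Point, int]]], centroids: List[Point]) -> List[Point]:
    """Reduce: sum the points emitted for each key and divide by their count."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = defaultdict(int)
    for key, (point, n) in pairs:
        if key in sums:
            sums[key] = [s + p for s, p in zip(sums[key], point)]
        else:
            sums[key] = list(point)
        counts[key] += n
    updated = list(centroids)  # a centroid with no assigned points stays where it is
    for key, total in sums.items():
        updated[key] = tuple(s / counts[key] for s in total)
    return updated


def kmeans(points: List[Point], centroids: List[Point], max_iter: int = 20) -> List[Point]:
    """Repeat map + reduce until the centroids stop changing (assumed stopping rule)."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)]))
```

On a real cluster each iteration would be a separate MapReduce job, with the current centroids shared with every mapper; how the paper distributes them across its four nodes is not stated in the abstract.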

Similar Papers


Verification and validation of Parallel Support Vector Machine algorithm based on MapReduce Program model on Hadoop cluster
M Kiran ... Amresh Kumar
01 Dec 2013

Genetic Algorithm Based Parallel K-Means Data Clustering Algorithm Using MapReduce Programming Paradigm on Hadoop Environment (GAPKCA)
Sayer Alshammari ... Maslina Binti Zolkepli
05 Dec 2019

A parallel hierarchical clustering algorithm for PCs cluster system
Zhonghui Feng ... Junyi Shen
Neurocomputing | VOL. 70
25 Oct 2006

PSCAN: A Parallel Structural Clustering Algorithm for networks
Jia-Jun Chen ... Jie Liu
01 Jul 2013
