Data Mining Using Hierarchical Agglomerative Clustering Algorithm in Distributed Cloud Computing Environment

Kriti Srivastava, R Shah, D Valia, H Swaminarayan

doi:10.7763/ijcte.2013.v5.741

Abstract

Growth in worldwide business has led to offices distributed across geographical locations. As a result, data are loosely distributed across large-scale regional databases held at regional offices. To perform data mining, the distributed data must be merged and a data mining algorithm run on the merged data. Cloud computing poses a variety of challenges for data mining, arising from the dynamic structure of data distribution, in contrast to the typical database scenarios of conventional architectures. This paper presents a way to implement the Hierarchical Agglomerative Clustering algorithm so that it is suitable for large datasets, and to increase its efficiency by executing tasks in parallel. The results show that execution time grows linearly as the data set grows.
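The general idea of clustering distributed partitions in parallel and then merging the results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`centroid`, `agglomerate`, `distributed_hac`), the centroid linkage criterion, the two-phase local/global scheme, and the parameters `local_k` and `global_k` are all assumptions made for illustration. A thread pool stands in for the separate cloud nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist


def centroid(cluster):
    """Mean point of a cluster given as a list of equal-length tuples."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def agglomerate(clusters, k):
    """Naive centroid-linkage HAC: merge the two closest clusters until k remain."""
    clusters = [list(c) for c in clusters]  # copy so the input is not mutated
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # j > i, so removing cluster j leaves index i valid
        clusters[i].extend(clusters.pop(j))
    return clusters


def distributed_hac(partitions, local_k, global_k):
    """Hypothetical two-phase scheme: local clustering per site, then a global merge."""
    # Phase 1: each partition (one regional database) is clustered independently.
    # Threads are only a stand-in for remote workers in this sketch.
    with ThreadPoolExecutor() as pool:
        local_results = list(
            pool.map(
                lambda part: agglomerate([[p] for p in part], local_k),
                partitions,
            )
        )
    # Phase 2: pool the local clusters and keep merging until global_k remain.
    pooled = [cluster for site in local_results for cluster in site]
    return agglomerate(pooled, global_k)
```

Calling `distributed_hac(partitions, local_k, global_k)` with a list of point lists, one per site, returns `global_k` clusters as lists of points. The local phase shrinks each site's data to a few clusters before anything is combined, which is one plausible way to keep the merge step small; the pairwise search in `agglomerate` is deliberately naive and would need a more efficient structure for large datasets.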

Full Text