Analysis of large volume data processing using clustering algorithms

Sarada B, Udaya Rani V, Vinayaka Murthy M

doi:10.14419/ijet.v7i4.5.25058

Abstract

Big data refers to the study of large datasets characterized by velocity, variety, and volume. When a dataset has a limited number of clusters, low dimensionality, and a small number of data points, existing traditional clustering algorithms can be used. In the internet age, however, data is growing very fast, and existing clustering algorithms do not give acceptable results in terms of time and space complexity. There is therefore a need for a new approach to clustering large volumes of data with low time and space complexity, using MapReduce and the Hadoop framework applied to different clustering algorithms: k-means, canopy clustering, and a proposed algorithm. The analysis shows that processing a large volume of data incurs low time and space complexity when compared with a small volume of data.

Highlights

As data increases in volume, variety, and velocity, existing clustering algorithms take more time to produce results.
MapReduce is a programming model for processing large datasets in parallel. MapReduce together with HDFS, commonly known as Hadoop, can be used to handle big data. Once a file is placed into HDFS, it can be read any number of times.
The execution time of the k-means clustering algorithm is given by O(n·k·i·d), where n is the number of data points, k is the number of clusters, i is the number of iterations needed to converge, and d is the number of dimensions.
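To make the k-means cost concrete, the following is a minimal sketch of plain (Lloyd's) k-means in Python, not the authors' implementation. The nested loops show where the n, k, i, and d factors of the standard O(n·k·i·d) bound come from. The function name `kmeans` and the choice to seed centroids with the first k points are assumptions made purely for illustration.

```python
def kmeans(points, k, max_iters=100):
    """Plain Lloyd's k-means. Returns (centroids, labels, iterations run).

    Assumption for illustration: centroids start as the first k points.
    """
    d = len(points[0])
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    iteration = 0
    for iteration in range(1, max_iters + 1):       # up to i iterations
        new_labels = []
        for p in points:                            # n data points
            # k centroids times d coordinates: n * k * d work per iteration
            dists = [sum((p[j] - c[j]) ** 2 for j in range(d)) for c in centroids]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:                    # assignments stable: converged
            break
        labels = new_labels
        # Recompute each centroid as the mean of its assigned points.
        for idx in range(k):
            members = [p for p, lab in zip(points, labels) if lab == idx]
            if members:                             # keep old centroid if cluster is empty
                centroids[idx] = [sum(m[j] for m in members) / len(members)
                                  for j in range(d)]
    return centroids, labels, iteration


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(pts, 2))
```

Each iteration compares every point with every centroid across every dimension, so the total work grows linearly in each of the four quantities.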

Summary

Introduction

As data increases in volume, variety, and velocity, existing clustering algorithms take more time to produce results. Producing results with less time and less memory calls for parallel programming. MapReduce is a programming model for processing large datasets in parallel. MapReduce together with HDFS, commonly known as Hadoop, can be used to handle big data. Once a file is placed into HDFS, it can be read any number of times.
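As a rough illustration of how MapReduce applies to clustering, the sketch below simulates one k-means iteration as a map step, a shuffle step, and a reduce step in a single Python process. It does not use the Hadoop API, and it is not the algorithm proposed in the paper. The names `map_phase`, `shuffle`, `reduce_phase`, and `kmeans_iteration` are hypothetical.

```python
from collections import defaultdict


def map_phase(point, centroids):
    """Mapper: emit (index of nearest centroid, point)."""
    dists = [sum((x - c) ** 2 for x, c in zip(point, centroid))
             for centroid in centroids]
    return dists.index(min(dists)), point


def shuffle(pairs):
    """Group mapper output by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: average all points assigned to one centroid."""
    n = len(values)
    return key, tuple(sum(coords) / n for coords in zip(*values))


def kmeans_iteration(points, centroids):
    """Run one map/shuffle/reduce round and return the updated centroids."""
    pairs = [map_phase(p, centroids) for p in points]
    groups = shuffle(pairs)
    new_centroids = list(centroids)   # centroids with no points stay unchanged
    for key, values in groups.items():
        idx, centroid = reduce_phase(key, values)
        new_centroids[idx] = centroid
    return new_centroids


if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (12.0, 10.0)]
    print(kmeans_iteration(pts, [(0.0, 0.0), (10.0, 10.0)]))
```

In a real cluster the mapper calls would run in parallel on separate blocks of the input file, which is why the mapper only needs the current centroids and a single point.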

MapReduce

Reduce phase

Proposed system

Existing system approach

Limitations

K-means clustering algorithm

Canopy k-means clustering algorithm

Result and analysis

Conclusion

Journal: International Journal of Engineering & Technology
Publication Date: Sep 22, 2018
License type: other-oa

