Sum of Distance based Algorithm for Clustering Web Data

Mahesh Motwani, Neeti Arora

doi:10.5120/15221-3732

Abstract

Clustering is a data mining technique that groups objects with similar characteristics. The similarity criterion is implementation dependent. Clustering analyzes data objects without consulting a known class label or category, i.e., it is an unsupervised data mining technique. K-means is a widely used clustering algorithm that chooses random initial cluster centers (centroids), one for each cluster. The performance of K-means depends strongly on this initial guess, and the final cluster centroids may not be optimal because the algorithm can converge to a local optimum. A good choice of initial centroids is therefore important for K-means. An algorithm for clustering has been developed that selects initial centroids based on the sum of distances from each data object to all other data objects. The proposed algorithm yields better clustering than the K-means technique on synthetic as well as real datasets.
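The abstract does not spell out how the sum of distances is turned into initial centroids, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It computes, for every data object, the sum of its Euclidean distances to all other objects, and then uses a hypothetical selection rule (sort objects by that sum, split the ordering into k contiguous chunks, take the middle object of each chunk) to seed a standard K-means loop. The function names `sum_of_distances`, `initial_centroids`, and `kmeans` are illustrative and do not come from the paper.

```python
import math


def sum_of_distances(points):
    """For each point, return the sum of Euclidean distances to all other points."""
    return [sum(math.dist(p, q) for q in points) for p in points]


def initial_centroids(points, k):
    """Pick k initial centroids from the sum-of-distance ordering.

    ASSUMPTION (not stated in the abstract): points are sorted by their
    distance sum, the ordering is cut into k contiguous chunks, and the
    middle-ranked point of each chunk becomes an initial centroid.
    Requires 1 <= k <= len(points).
    """
    sums = sum_of_distances(points)
    order = sorted(range(len(points)), key=lambda i: sums[i])
    n = len(points)
    centroids = []
    for j in range(k):
        lo = j * n // k
        hi = (j + 1) * n // k
        centroids.append(points[order[(lo + hi) // 2]])
    return centroids


def kmeans(points, centroids, max_iter=100):
    """Standard K-means (Lloyd's iterations) starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    seeds = initial_centroids(data, k=2)
    final_centroids, groups = kmeans(data, seeds)
    print(seeds, final_centroids, groups)
```

Because the seeding is deterministic, repeated runs on the same data give the same clustering, unlike random initialization. Computing all pairwise distances costs O(n²) time, which is a consideration for large datasets.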


Journal: International Journal of Computer Applications
Publication Date: Feb 14, 2014
Citations: 7

