Abstract
Cluster analysis is a major technique in statistical analysis, machine learning, pattern recognition, data mining, image analysis, and bioinformatics. The k-means algorithm is one of the most important clustering algorithms. However, it needs a large amount of computational time to handle large data sets. In this paper, we develop a more efficient clustering algorithm, named Fast Balanced k-means (FBK-means), to overcome this deficiency. The algorithm not only yields the same best clustering results as the k-means algorithm but also requires less computational time. It works well in the case of balanced data.
Highlights
The problem of clustering is perhaps one of the most widely studied in the data mining and machine learning communities
In the k-means algorithm, a cluster is represented by the mean value of the data points within it, and clustering is done by minimizing the sum of distances between data points and their corresponding cluster centers
The genetic clustering algorithm (GA) parameters used in the experiments: population size = 10, roulette selection, single-point crossover, the probability of crossover
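The two GA operators named in the last highlight, roulette selection and single-point crossover, can be illustrated with a minimal sketch. The chromosome encoding (one cluster label per data point), the placeholder fitness function, and the names roulette_select and single_point_crossover are assumptions made for illustration; the source does not specify them, and the crossover probability is left out because its value is not given.

```python
import random

def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its (non-negative) fitness."""
    total = sum(fitness)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, score in zip(population, fitness):
        running += score
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off

def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at one random cut point."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]

rng = random.Random(0)
# Population size = 10, as in the highlight. Each chromosome assigns one of
# 3 cluster labels to each of 6 points (hypothetical encoding).
population = [[rng.randrange(3) for _ in range(6)] for _ in range(10)]
# Placeholder fitness (hypothetical); a clustering GA would score cluster quality.
fitness = [1.0 + sum(individual) for individual in population]

mother = roulette_select(population, fitness, rng)
father = roulette_select(population, fitness, rng)
child_a, child_b = single_point_crossover(mother, father, rng)
```

Roulette selection needs non-negative fitness values, so a cost that is being minimized (such as the sum of distances to cluster centers) would have to be transformed first, for example by taking its reciprocal.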
Summary
The problem of clustering is perhaps one of the most widely studied in the data mining and machine learning communities. The k-means clustering algorithm [7] is one of the most efficient clustering algorithms for large-scale spherical data sets. It has extensive applications in such domains as financial fraud, medical diagnosis, image processing, information retrieval, and bioinformatics [8]. The k-means algorithm and its variants are known to be fast algorithms for solving such problems. However, they are sensitive to the choice of starting points and can only be applied to small datasets [10]. The multi-restart k-means algorithm becomes very time consuming and inefficient for solving clustering problems, even on moderately large datasets [11]. A new clustering algorithm, called FBK-means, is proposed for clustering large data sets.
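The sensitivity to starting points and the cost of multi-restart k-means can be made concrete with a minimal sketch of standard k-means wrapped in a restart loop. This is not FBK-means, which is not described in this section. The function names, the toy data, the number of restarts, and the use of squared Euclidean distance (the usual k-means objective) are assumptions made for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def kmeans(points, k, rng, max_iter=100):
    """One k-means run from random starting points; returns (centers, labels, cost)."""
    centers = rng.sample(points, k)  # the result depends on this random choice
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                # A cluster is represented by the mean of its members.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    labels = assign(points, centers)
    cost = sum(sq_dist(p, centers[label]) for p, label in zip(points, labels))
    return centers, labels, cost

def multi_restart_kmeans(points, k, restarts=10, seed=0):
    """Run k-means several times and keep the lowest-cost result."""
    rng = random.Random(seed)
    return min((kmeans(points, k, rng) for _ in range(restarts)),
               key=lambda result: result[2])

# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels, cost = multi_restart_kmeans(points, k=2)
```

Each restart repeats the full assignment and update loop over every point, so the total running time grows linearly with the number of restarts. That is the overhead that makes multi-restart k-means expensive on larger data sets.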