COMPARATIVE STUDY OF MUTATION OPERATORS IN THE GENETIC ALGORITHMS FOR THE K-MEANS PROBLEM

Rui Li,Lev A Kazakovtsev

doi:10.22190/fumi2004091l

Abstract

The k-means problem and the algorithm of the same name are the most commonly used clustering model and algorithm. Being a local search optimization method, the k-means algorithm falls to a local minimum of the objective function (sum of squared errors) and depends on the initial solution which is given or selected randomly. This disadvantage of the algorithm can be avoided by combining this algorithm with more sophisticated methods such as the Variable Neighborhood Search, agglomerative or dissociative heuristic approaches, the genetic algorithms, etc. Aiming at the shortcomings of the k-means algorithm and combining the advantages of the k-means algorithm and rvolutionary approack, a genetic clustering algorithm with the cross-mutation operator was designed. The eﬃciency of the genetic algorithms with the tournament selection, one-point crossover and various mutation operators (without any mutation operator, with the uniform mutation, DBM mutation and new cross-mutation) are compared on the data sets up to 2 millions of data vectors. We used data from the UCI repository and special data set collected during the testing of the highly reliable semiconductor components. In this paper, we do not discuss the comparative eﬃciency of the genetic algorithms for the k-means problem in comparison with the other (non-genetic) algorithms as well as the comparative adequacy of the k-means clustering model. Here, we focus on the inﬂuence of various mutation operators on the eﬃciency of the genetic algorithms only.

Full Text