Abstract

Selecting a good initial solution is an important step in reaching an optimal result with clustering methods in machine learning. This study proposes an improved initialization method for the simple and fast K-medoids clustering algorithm that considers not only the sum of the relative distance rates for each data point but also the distance between the previously selected medoids (representative objects). Selecting balanced initial medoids keeps the dissimilarity of patterns within the same cluster small and the dissimilarity of patterns in different clusters large. The performance of the proposed method is verified and validated with two cluster validity indices on seven machine learning repository datasets from prominent applications, showing large improvements. The Mann–Whitney test is employed to assess the statistical significance of the performance differences between the initialization method of Park and Jun (2009) and the one proposed here. The proposed initial selection method is effective, highly reliable, and computationally practical even for large problems.
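
For concreteness, the sketch below illustrates one plausible reading of the initialization idea summarized above (using NumPy and Euclidean distances): each medoid after the first is chosen by trading off the Park-and-Jun relative-distance score against the distance to the medoids already selected. The function name, the normalization, and the exact combination rule are assumptions for illustration, not the paper's published formulation.

```python
import numpy as np


def select_initial_medoids(X, k):
    """Pick k initial medoid indices from the rows of X.

    Illustrative sketch only: a Park-and-Jun-style relative-distance score
    is combined with a spread term that penalizes candidates lying close to
    already-selected medoids. The paper's actual weighting may differ.
    """
    # Pairwise Euclidean distance matrix (n x n).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Park and Jun (2009) score v_j = sum_i d_ij / sum_l d_il:
    # small values mark points that are centrally located overall.
    v = (d / d.sum(axis=1, keepdims=True)).sum(axis=0)

    medoids = [int(np.argmin(v))]  # first medoid: most central point
    while len(medoids) < k:
        # Distance from every point to its nearest already-selected medoid.
        nearest = d[:, medoids].min(axis=1)
        # Favour central points (small v) that are also far from the chosen
        # medoids (large nearest), so the initial medoids end up spread out.
        score = v / v.max() - nearest / nearest.max()
        score[medoids] = np.inf  # never reselect a medoid
        medoids.append(int(np.argmin(score)))
    return medoids


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three well-separated Gaussian blobs as a toy example.
    X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2)) for c in (0, 3, 6)])
    print(select_initial_medoids(X, k=3))
```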
