Abstract

Conventional k-means clustering is a widely used partitional method, applied mainly to machine learning and pattern recognition problems. The algorithm is highly sensitive to its initial centroids, and because these are chosen randomly, it cannot guarantee convergence to a good solution. In this paper, we develop a new initialization method for k-means clustering. We also propose a modified Dunn Index and introduce a new validity ratio based on the silhouette index. The sum of squared error, Dunn Index, silhouette index, modified Dunn Index, and silhouette validity ratio are used as criteria to evaluate the performance of the initialization algorithm. We assess the proposed initialization algorithm on various benchmark datasets and compare the results with the conventional k-means and k-means++ algorithms. The results show that the proposed initialization algorithm yields the lowest sum of squared error and the fewest iterations among the compared methods. A precision chart is used to test the consistency of the initialization algorithm. The comparative analysis, based on the modified Dunn Index and the silhouette validity ratio, shows that the proposed initialization algorithm performs better than the other initialization algorithms.
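For readers unfamiliar with the three standard criteria named in the abstract, the following is a minimal sketch of the sum of squared error, the Dunn Index, and the mean silhouette index in their usual textbook forms. It does not reproduce the paper's modified Dunn Index or silhouette validity ratio, whose definitions are not given here. The function names, the list-of-clusters input format, and the toy data are assumptions made for illustration, and the sketch assumes Euclidean distance, at least two clusters, and at least one cluster with two or more distinct points.

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python 3.8+)


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sse(clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for points in clusters:
        c = centroid(points)
        total += sum(dist(p, c) ** 2 for p in points)
    return total


def dunn_index(clusters):
    """Smallest between-cluster distance over largest cluster diameter."""
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        dist(p, q)
        for points in clusters
        for p, q in combinations(points, 2)
    )
    return min_separation / max_diameter


def mean_silhouette(clusters):
    """Average of (b - a) / max(a, b) over all points."""
    scores = []
    for i, own in enumerate(clusters):
        for p in own:
            if len(own) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of p's own cluster
            # (dist(p, p) is 0, so summing over the whole cluster is safe).
            a = sum(dist(p, q) for q in own) / (len(own) - 1)
            # b: mean distance to the nearest other cluster.
            b = min(
                sum(dist(p, q) for q in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: the same four points partitioned two ways.
good = [[(0, 0), (0, 1)], [(4, 0), (4, 1)]]
poor = [[(0, 0), (4, 0)], [(0, 1), (4, 1)]]

for name, clusters in (("good", good), ("poor", poor)):
    print(name, sse(clusters), dunn_index(clusters), mean_silhouette(clusters))
```

On this toy data the compact, well-separated partition has the lower sum of squared error and the higher Dunn and silhouette values, which is the direction in which each criterion rewards a better initialization.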
