Abstract

Clustering mixed-attribute data is an important and challenging problem. The density peaks clustering (DPC) algorithm offers a simple and efficient solution, but it mainly targets numerical attribute data and is not adaptive. In this paper, we study how to make the algorithm adaptive and propose an adaptive mixed-attribute data clustering method based on density peaks, called AMDPC. The algorithm uses a unified distance metric for mixed-attribute data to construct the distance matrix, calculates the local density based on K-nearest neighbors, and determines cluster centers automatically based on three inflection points. Experimental results on real University of California-Irvine (UCI) datasets show that AMDPC achieves adaptive clustering of mixed-attribute data and automatically obtains the correct number of clusters. Compared with the traditional K-prototype algorithm, it improves clustering accuracy by more than 22.58%, 24.25%, 28.03%, 22.5%, and 10.12% on the Heart, Cleveland, Credit, Acute, and Adult datasets, respectively. It also outperforms DPC_M, a modified density peaks clustering algorithm for mixed-attribute data.
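
The pipeline summarized in the abstract (mixed-attribute distance matrix, K-nearest-neighbor density, density-peak centers) can be illustrated with a minimal sketch. Everything specific below is an assumption and not the AMDPC definition: the distance is a Gower-style average of range-scaled numeric differences and categorical mismatches, the density is exp(-mean K-nearest-neighbor distance), and the number of centers is passed in by hand as `n_centers` instead of being found from inflection points. All function names are hypothetical.

```python
import numpy as np

def mixed_distance_matrix(num, cat):
    """Assumed Gower-style distance: range-scaled |diff| on numeric columns
    plus 0/1 mismatch on categorical columns, averaged over all attributes."""
    num = np.asarray(num, dtype=float)
    cat = np.asarray(cat)
    span = num.max(axis=0) - num.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero on constant columns
    scaled = (num - num.min(axis=0)) / span
    d_num = np.abs(scaled[:, None, :] - scaled[None, :, :]).sum(axis=2)
    d_cat = (cat[:, None, :] != cat[None, :, :]).sum(axis=2)
    return (d_num + d_cat) / (num.shape[1] + cat.shape[1])

def knn_density(dist, k):
    """Assumed density: exp of minus the mean distance to the k nearest neighbors."""
    nearest = np.sort(dist, axis=1)[:, 1:k + 1]  # column 0 is the self-distance
    return np.exp(-nearest.mean(axis=1))

def density_peaks_labels(dist, k, n_centers):
    """Density-peaks assignment with a hand-picked number of centers."""
    n = dist.shape[0]
    rho = knn_density(dist, k)
    order = np.argsort(-rho, kind="stable")  # indices by decreasing density
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]  # points treated as denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i], parent[i] = dist[i, j], j
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest point is always a center
    centers = np.argsort(-gamma)[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    for i in order:  # each non-center inherits the label of its denser neighbor
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

# Toy data: one numeric and one categorical attribute, two obvious groups.
num = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
cat = [["a"], ["a"], ["a"], ["b"], ["b"], ["b"]]
labels = density_peaks_labels(mixed_distance_matrix(num, cat), k=2, n_centers=2)
```

On this toy input the first three points share one label and the last three share the other. Replacing the fixed `n_centers` with an automatic rule is the part the paper addresses and is deliberately left out here.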
