Abstract

Most existing clustering methods have difficulty processing complex nonlinear data sets. To remedy this deficiency, this paper proposes a novel data model, termed the Hybrid K-Nearest-Neighbor (HKNN) graph, which combines the advantages of the mutual k-nearest-neighbor graph and the k-nearest-neighbor graph to represent nonlinear data sets. A Clustering method based on the HKNN graph (CHKNN) is also proposed. CHKNN first generates several small, tight subclusters and then merges them by exploiting the connectivity among them. To select the optimal parameters for CHKNN, we further propose an internal validity index, termed the K-Nearest-Neighbor Index (KNNI), which can also be used to evaluate the validity of nonlinear clustering results by varying a control parameter. Experimental results on synthetic and real-world data sets, as well as on video clustering, demonstrate significant performance improvements over existing nonlinear clustering methods and internal validity indices.
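
To make the two ingredient graphs concrete, the following is a minimal sketch in plain Python of a k-nearest-neighbor graph and a mutual k-nearest-neighbor graph on a toy point set. It is not the HKNN construction or the CHKNN procedure, neither of which is specified in this abstract. The function names, the toy points, the choice of Euclidean distance, and the use of connected components to suggest "tight subclusters" are all assumptions made for illustration.

```python
import math


def knn_lists(points, k):
    """For each point index, return the set of its k nearest other points.

    Assumes Euclidean distance; ties are broken by index.
    """
    neighbors = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def knn_edges(neighbors):
    """Undirected kNN graph: i-j is an edge if either point lists the other."""
    return {frozenset((i, j)) for i, nbrs in enumerate(neighbors) for j in nbrs}


def mutual_knn_edges(neighbors):
    """Mutual kNN graph: i-j is an edge only if each point lists the other."""
    return {
        frozenset((i, j))
        for i, nbrs in enumerate(neighbors)
        for j in nbrs
        if i in neighbors[j]
    }


def connected_components(n, edges):
    """Return the connected components of an undirected graph as sorted lists."""
    adj = {i: set() for i in range(n)}
    for edge in edges:
        a, b = tuple(edge)
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(comp))
    return components


# Hypothetical toy data: two tight groups plus one point lying between them.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
nbrs = knn_lists(points, k=2)
knn = knn_edges(nbrs)
mutual = mutual_knn_edges(nbrs)

print("kNN components:   ", connected_components(len(points), knn))
print("mutual components:", connected_components(len(points), mutual))
```

In this toy setting the mutual graph is a subgraph of the kNN graph: it keeps only reciprocated links, so the in-between point stays isolated, while the kNN graph attaches it to one of the groups. This difference in strictness is one plausible reading of the complementary "advantages" the abstract refers to.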
