Abstract
Density-based clustering algorithms are designed to cluster data of arbitrary shape. However, most of them struggle with high-dimensional data of varying density; in particular, they often fail to discover clusters in sparse regions. In this paper, we define a new type of density, the local gap density, on the k-NN graph, which works well for high-dimensional data. The local gap density of a point takes into account not only the number of points in its nearest neighborhood but also the average distance from the point to those neighbors. As a result, core points that lie in regions considered sparse by existing density-based methods receive high density under our definition and are therefore easy to detect. Using the core points, the potential cross-cluster edges in the k-NN graph can be identified. After deleting these edges, we treat the points of each connected component of large cardinality as a subcluster and then, similarly to density peaks clustering, assign each remaining point to its corresponding existing subcluster. Extensive experiments on eight publicly available datasets demonstrate the effectiveness of our clustering algorithm.
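The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the formula for the local gap density, the rule for selecting core points, or the criterion for cross-cluster edges, so `standin_density`, the `core_fraction` cut-off, the edge rule, and the `min_size` threshold are all hypothetical placeholders. The final assignment step is a simplification of the density peaks rule: each leftover point joins the subcluster of its nearest already-labelled point, visited in order of decreasing density.

```python
import math
import random

def knn(points, k):
    """Brute-force k-NN: for each point, its k nearest (distance, index) pairs."""
    out = []
    for i, p in enumerate(points):
        d = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        out.append(d[:k])
    return out

def standin_density(nbrs):
    """ASSUMPTION: placeholder density, NOT the paper's local gap density.
    Ratio of the neighbors' mean k-NN distance to the point's own mean k-NN
    distance, so a point in a sparse but locally uniform region scores near 1."""
    mean_d = [sum(d for d, _ in nb) / len(nb) for nb in nbrs]
    return [
        sum(mean_d[j] for _, j in nb) / len(nb) / (mean_d[i] + 1e-12)
        for i, nb in enumerate(nbrs)
    ]

def components(n, edges):
    """Union-find: map every node to the root of its connected component."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for a, b in edges:
        parent[find(a)] = find(b)
    return [find(i) for i in range(n)]

def cluster(points, k=5, core_fraction=0.5, min_size=5):
    """Return (subcluster labels with -1 for leftovers, final labels)."""
    n = len(points)
    nbrs = knn(points, k)
    rho = standin_density(nbrs)
    order = sorted(range(n), key=lambda i: -rho[i])  # decreasing density
    # Hypothetical core rule: the densest fraction of points are core points.
    core = set(order[: max(1, int(core_fraction * n))])
    # Hypothetical edge rule: delete k-NN edges that touch no core point.
    edges = [(i, j) for i in range(n) for _, j in nbrs[i] if i in core or j in core]
    root = components(n, edges)
    sizes = {}
    for r in root:
        sizes[r] = sizes.get(r, 0) + 1
    # Large components become subclusters, numbered 0, 1, ...
    big = sorted(r for r, s in sizes.items() if s >= min_size)
    ids = {r: c for c, r in enumerate(big)}
    sub = [ids.get(r, -1) for r in root]
    # Simplified density-peaks-style assignment of the remaining points.
    labels = list(sub)
    for i in order:
        done = [j for j in range(n) if labels[j] >= 0]
        if labels[i] < 0 and done:
            nearest = min(done, key=lambda j: math.dist(points[i], points[j]))
            labels[i] = labels[nearest]
    return sub, labels

# Toy data: two well-separated uniform blobs of 30 points each.
random.seed(0)
blob_a = [(random.random(), random.random()) for _ in range(30)]
blob_b = [(random.random() + 10, random.random() + 10) for _ in range(30)]
sub, labels = cluster(blob_a + blob_b)
```

The brute-force neighbor search is quadratic in the number of points and is used here only to keep the sketch dependency-free; an indexed or approximate k-NN search would be the usual choice for the high-dimensional data the abstract targets.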