Abstract
Complex data types and distributions increase the uncertainty in the relationships between samples, which makes it difficult to mine the latent cluster structure of data effectively. Ensemble clustering aims to obtain a unified partition by fusing multiple different base clustering results. This paper proposes a three-way ensemble clustering algorithm based on sample perturbation theory to address the inaccurate decisions caused by inaccurate information or insufficient data. The algorithm first uses the natural nearest neighbor algorithm to generate two sets of perturbed data, randomly extracts feature subsets of the samples, and applies a traditional clustering algorithm to obtain different base clusterings. The stability of each sample is computed from the co-association matrix and a determinacy function, and the samples are then divided into a stable region and an unstable region according to a stability threshold. The stable region consists of high-stability samples, which are divided into the core regions of the clusters using the K-means algorithm. The unstable region consists of low-stability samples, which are assigned to the fringe regions of the clusters. Together, these assignments form a three-way clustering result. Experimental results on data sets from the UCI Machine Learning Repository show that the proposed algorithm obtains better clustering results than other clustering ensemble algorithms and can effectively reveal the clustering structure.
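The stability step can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not define the determinacy function or the threshold, so the form f(p) = |2p - 1| and the threshold value 0.5 are assumptions chosen for illustration, as are all function names. The base clusterings are given directly as label lists, and the perturbation and feature-sampling steps are omitted.

```python
from itertools import combinations


def co_association(base_labelings):
    """Fraction of base clusterings in which each pair of samples shares a cluster."""
    m = len(base_labelings)
    n = len(base_labelings[0])
    ca = [[0.0] * n for _ in range(n)]
    for i in range(n):
        ca[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        agree = sum(1 for labels in base_labelings if labels[i] == labels[j])
        ca[i][j] = ca[j][i] = agree / m
    return ca


def determinacy(p):
    # Assumed form (not taken from the paper): 1 when a pair is always or
    # never co-clustered, 0 when the base clusterings split evenly (p = 0.5).
    return abs(2.0 * p - 1.0)


def sample_stability(ca):
    """Mean determinacy of each sample's relationships to all other samples."""
    n = len(ca)
    return [
        sum(determinacy(ca[i][j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def split_by_stability(stability, threshold):
    """Indices of the stable (>= threshold) and unstable (< threshold) samples."""
    stable = [i for i, s in enumerate(stability) if s >= threshold]
    unstable = [i for i, s in enumerate(stability) if s < threshold]
    return stable, unstable


# Four hypothetical base clusterings of five samples. Label values may be
# permuted between clusterings; co-association only looks at agreement.
base_labelings = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]

ca = co_association(base_labelings)
stability = sample_stability(ca)
stable, unstable = split_by_stability(stability, threshold=0.5)  # assumed threshold
print(stability)
print(stable, unstable)
```

In this toy input, sample 2 switches sides in half of the base clusterings, so it lands in the unstable region while the other four samples are stable. In the full algorithm described above, the stable samples would then be clustered with K-means to form the core regions, and the unstable samples would be assigned to fringe regions. The abstract does not give that assignment rule, so it is left out here.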