Abstract

Data clustering partitions samples into clusters so that the samples within each cluster are maximally similar to one another and maximally distant from the samples of other clusters. Because clustering is unsupervised, choosing a specific algorithm to cluster an unknown dataset is risky, and the chosen algorithm is often not the best option. Given the complexity of the problem and the limited effectiveness of basic clustering methods, most studies have turned to combined clustering methods. We call the output partition of a clustering algorithm a result. The diversity of the results in an ensemble of basic clusterings is one of the most important factors affecting the quality of the final result; the quality of those individual results is another. Both factors have been considered in recent research on combined clustering. We propose a new framework that improves the efficiency of combined clustering by selecting a subset of the primary clusters. Selecting a proper subset plays a crucial role in the performance of our method. The main idea is to select clusters that are stable, and the selection is carried out by intelligent search algorithms. Cluster stability is assessed with a criterion based on mutual information. Finally, the selected clusters are aggregated by consensus functions. Experimental results on several standard datasets show that the proposed method effectively improves on the complete ensemble method.
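
As a rough illustration of the selection step, the sketch below scores each cluster by the mean normalized mutual information (NMI) between its membership indicator and the other partitions in the ensemble, then keeps the clusters whose score reaches a threshold. This is only one plausible reading of a mutual-information-based stability criterion. The function names, the square-root NMI normalization, and the fixed threshold (used here in place of the intelligent search algorithms) are all assumptions made for illustration, and the consensus step is omitted.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Mutual information of two labelings, normalized by sqrt(H(a) * H(b))."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(
        (n_xy / n) * log(n_xy * n / (count_a[x] * count_b[y]))
        for (x, y), n_xy in joint.items()
    )
    denom = sqrt(entropy(a) * entropy(b))
    return mi / denom if denom > 0 else 0.0


def cluster_stability(partition, label, references):
    """Assumed stability score: mean NMI between the binary membership
    indicator of one cluster and each reference partition."""
    indicator = [1 if l == label else 0 for l in partition]
    return sum(nmi(indicator, ref) for ref in references) / len(references)


def select_stable_clusters(partitions, threshold):
    """Keep (partition index, cluster label) pairs whose stability is at
    least `threshold`. A plain threshold stands in for a search algorithm."""
    selected = []
    for i, partition in enumerate(partitions):
        others = partitions[:i] + partitions[i + 1:]
        for label in sorted(set(partition)):
            if cluster_stability(partition, label, others) >= threshold:
                selected.append((i, label))
    return selected


# Toy ensemble of three partitions over six samples.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
]
stable = select_stable_clusters(ensemble, threshold=0.6)
```

In this toy ensemble the clusters of the first two partitions agree with each other and pass the threshold, while the clusters of the third partition cut across them and are dropped. The surviving clusters would then be passed to a consensus function.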

