Abstract

Ensemble clustering is an efficient unsupervised learning technique that has attracted a lot of attention. Its purpose is to aggregate the results of several basic clustering algorithms in order to produce a better clustering. Many techniques for doing so have been developed in recent years. However, challenges remain, such as the choice of similarity measure, the agreement function, and how the clustering algorithms are implemented. To address these challenges, this article proposes an ensemble clustering algorithm based on the consistency cluster consensus approach and the MapReduce model. The proposed algorithm is also equipped with a new membership similarity measure. The consistency cluster consensus is defined as a new agreement function for reaching a consensus among the results of the basic clustering methods. The proposed similarity measure consists of two factors: cluster similarity and membership similarity. The proposed ensemble clustering method proceeds in four steps. In the first step, partitions are generated by applying a number of individual clustering algorithms to the data. In the second step, the primary clusters from the partitions are converted to a binary representation. In the third step, the candidate primary clusters for the consensus are identified, with emphasis on the subset of clusters that have the highest similarity to one another. In the fourth step, the consistency cluster consensus is performed through the new agreement function. In addition, we use the MapReduce model to implement the basic clustering algorithms in order to reduce computational complexity. Our method is evaluated on several real-world datasets, and its performance is compared with that of other clustering ensemble methods, such as TWCE and RDPC-DSS. Experiments show that our method is more efficient and accurate, outperforming the best of the compared methods by 7% to 15%.
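The second and third steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the proposed similarity measure, so plain Jaccard similarity between binary membership vectors is used here as a stand-in, and the function names, the toy partitions, and the `threshold` parameter are all hypothetical.

```python
from itertools import combinations


def clusters_to_binary(partitions):
    """Turn every primary cluster into a 0/1 membership vector over all objects."""
    binary = []
    for p_idx, labels in enumerate(partitions):
        for label in sorted(set(labels)):
            vec = [1 if l == label else 0 for l in labels]
            # Identify each primary cluster by (partition index, cluster label).
            binary.append(((p_idx, label), vec))
    return binary


def jaccard(u, v):
    """Jaccard similarity of two binary vectors (a stand-in, not the paper's measure)."""
    inter = sum(a & b for a, b in zip(u, v))
    union = sum(a | b for a, b in zip(u, v))
    return inter / union if union else 0.0


def candidate_pairs(binary, threshold):
    """Keep cross-partition cluster pairs whose similarity reaches the threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(binary, 2):
        if id_a[0] == id_b[0]:
            continue  # clusters of the same partition are disjoint by construction
        s = jaccard(vec_a, vec_b)
        if s >= threshold:
            pairs.append((id_a, id_b, s))
    # Most similar pairs first; ties keep their original order.
    return sorted(pairs, key=lambda t: -t[2])


# Toy input: three base partitions of six objects (hypothetical labels).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
binary = clusters_to_binary(partitions)
candidates = candidate_pairs(binary, threshold=0.6)
```

In this toy input the first and third partitions describe the same grouping under swapped labels, which the binary representation makes visible: the matching clusters have identical membership vectors even though their labels differ.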
