Abstract

Several cluster ensemble methods have been developed in recent years, but they still have limitations. They commonly use different clustering algorithms in the two stages of the ensemble, the ensemble generation step and the consensus function, which creates compatibility issues between algorithms that work differently. The accuracy of the final result is a further major concern. To address both issues, we propose a novel cluster ensemble method based on a single clustering algorithm (CES). In the ensemble generation step, we run the affinity propagation (AP) clustering algorithm ten times to obtain multiple base partitions. Because AP produces a random number of clusters, the partitions obtained across iterations are highly diverse. With a few modifications, the same algorithm is then used to build a novel consensus function that combines these base partitions into a single partition. The proposed consensus function exploits a small amount of side information: partial labels, supplied to AP as pairwise constraints, and the number of clusters in the dataset. With this information, AP is constrained to produce the actual number of cluster centres in the dataset rather than a random number of clusters, which considerably enhances the accuracy of the final result. CES therefore uses the same clustering functionality in both stages and produces the desired number of clusters in the final partition, which significantly improves accuracy compared with state-of-the-art cluster ensemble methods. These modifications also allow CES to outperform AP in both accuracy and execution time. Experiments on real-world datasets from various sources show that CES improves accuracy by 5% on average compared with state-of-the-art cluster ensemble methods and by 55.54% compared with AP, while consuming 44.60% less execution time.
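
To make the two ingredients of the consensus step concrete, the following minimal Python sketch shows how a set of base partitions can be summarised pairwise and how partial labels can be turned into pairwise constraints. It is an illustration only, not the CES consensus function: it uses a standard co-association matrix as a stand-in for the combined base partitions, and the function names (`co_association`, `pairwise_constraints`, `apply_constraints`), the toy partitions, and the rule of forcing constrained pairs to similarity 1.0 or 0.0 are all assumptions made for this example.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base partitions in which each pair of points shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def pairwise_constraints(partial_labels):
    """Turn a few known labels (index -> class) into must-link / cannot-link pairs."""
    must, cannot = [], []
    for (i, li), (j, lj) in combinations(sorted(partial_labels.items()), 2):
        (must if li == lj else cannot).append((i, j))
    return must, cannot


def apply_constraints(matrix, must, cannot):
    """Return a copy of the matrix with constrained pairs forced to 1.0 or 0.0 (assumed rule)."""
    constrained = [row[:] for row in matrix]
    for i, j in must:
        constrained[i][j] = constrained[j][i] = 1.0
    for i, j in cannot:
        constrained[i][j] = constrained[j][i] = 0.0
    return constrained


# Three hypothetical base partitions of six points, with differing cluster counts.
base_partitions = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 1, 1, 1],
    [0, 1, 1, 2, 2, 2],
]
similarity = co_association(base_partitions)

# Hypothetical side information: labels are known for points 0, 1 and 3 only.
must_link, cannot_link = pairwise_constraints({0: "a", 1: "a", 3: "b"})
constrained_similarity = apply_constraints(similarity, must_link, cannot_link)
```

In a pipeline of this shape, the constrained similarity matrix would then be passed to the final clustering step; how CES actually modifies AP to honour the constraints and the target number of clusters is described in the full paper, not here.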
