Abstract

Ensemble learning has been shown to be a solid approach for reaching more accurate, stable, robust, and novel results in data mining tasks such as clustering, classification, and regression. Clustering ensemble, a sub-field of ensemble learning, is a general approach for improving the performance of the clustering task. In this paper, a clustering ensemble framework is proposed that is built on a new cluster validation criterion named Modified Normalized Mutual Information (MNMI). The framework first generates a large number of clusters and then selects some of them for the final ensemble: only the clusters that satisfy a threshold on the proposed metric participate in the final clustering ensemble. A co-association based consensus function is applied to combine the chosen clusters. Since the Evidence Accumulation Clustering (EAC) method cannot derive the co-association matrix from a subset of clusters, Extended Evidence Accumulation Clustering (EEAC) is applied to construct the co-association matrix from that subset. The ensemble obtained with the new cluster validation criterion is evaluated on several well-known standard datasets. The empirical studies show promising results for the ensemble obtained using the proposed criterion compared with the ensemble obtained using the standard cluster validation criterion.
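
The selection and co-association steps described above can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not define MNMI, so the validity scores below are hypothetical placeholders standing in for it. The normalization used for the co-association entry (the number of selected clusters containing both points, divided by the larger of the two points' appearance counts) is an assumption about how EEAC handles a subset of clusters. The function names select_clusters and extended_co_association are illustrative only.

```python
import numpy as np


def select_clusters(clusters, scores, threshold):
    """Keep the clusters whose validity score meets the threshold."""
    return [c for c, s in zip(clusters, scores) if s >= threshold]


def extended_co_association(clusters, n_points):
    """Build a co-association matrix from an arbitrary subset of clusters.

    Assumption: entry (i, j) is the number of selected clusters that contain
    both i and j, divided by max(n_i, n_j), where n_i is the number of
    selected clusters that contain point i.
    """
    together = np.zeros((n_points, n_points))
    appear = np.zeros(n_points)
    for members in clusters:
        idx = np.asarray(sorted(members), dtype=int)
        together[np.ix_(idx, idx)] += 1  # every pair inside this cluster
        appear[idx] += 1                 # per-point appearance count
    denom = np.maximum.outer(appear, appear)
    # Pairs where neither point appears in any selected cluster stay at 0.
    return np.divide(together, denom, out=np.zeros_like(together), where=denom > 0)


# Toy pool of clusters over 5 points; each cluster is a set of point indices.
clusters = [{0, 1, 2}, {3, 4}, {0, 1}, {2, 3, 4}, {0, 4}]
# Hypothetical validity scores (placeholders for MNMI, which is not defined here).
scores = [0.9, 0.8, 0.7, 0.4, 0.1]

selected = select_clusters(clusters, scores, threshold=0.5)
co = extended_co_association(selected, n_points=5)
```

The resulting matrix can then be passed to any co-association based consensus function, for example hierarchical clustering on the dissimilarity 1 - co. Which consensus function the paper uses is not specified in the abstract.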
