Abstract
Cluster ensemble is one of the main branches of ensemble learning, which has been an important research focus in recent years. The objective of cluster ensemble is to combine multiple clustering solutions in a suitable way to improve the quality of the clustering result. In this paper, we design a new noise-immune cluster ensemble framework named $AP^{2}CE$ to tackle the challenges raised by noisy datasets. $AP^{2}CE$ not only takes advantage of the affinity propagation algorithm (AP) and the normalized cut algorithm (Ncut), but also possesses the characteristics of cluster ensemble. Compared with traditional cluster ensemble approaches, $AP^{2}CE$ is characterized by several properties. (1) It adopts multiple distance functions instead of a single Euclidean distance function to avoid the noise related to the distance function. (2) It applies AP to prune noisy attributes and generate a set of new datasets in subspaces consisting of the representative attributes obtained by AP. (3) It avoids the explicit specification of the number of clusters. (4) It adopts the normalized cut algorithm as the consensus function to partition the consensus matrix and obtain the final result. To improve the performance of $AP^{2}CE$, an adaptive $AP^{2}CE$ is designed, which uses an adaptive process to optimize a newly designed objective function. Experiments on both synthetic and real datasets show that (1) $AP^{2}CE$ works well on most of the datasets, in particular the noisy datasets; (2) $AP^{2}CE$ is a better choice for most of the datasets when compared with other cluster ensemble approaches; and (3) $AP^{2}CE$ is able to provide more accurate, stable and robust results.
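The consensus step described in property (4) can be illustrated with a minimal sketch. This is not the authors' $AP^{2}CE$ implementation: it assumes the consensus matrix is a plain co-association matrix (the fraction of base partitions in which two samples share a cluster), it uses the standard two-way spectral relaxation of the normalized cut, and it omits the AP-based attribute pruning, the multiple distance functions, and the adaptive process. The function names `consensus_matrix` and `ncut_bipartition` and the toy partitions are hypothetical.

```python
import numpy as np


def consensus_matrix(partitions):
    """Co-association matrix: entry (i, j) is the fraction of base
    partitions that place samples i and j in the same cluster."""
    partitions = np.asarray(partitions)  # shape: (n_partitions, n_samples)
    same = partitions[:, :, None] == partitions[:, None, :]
    return same.mean(axis=0)


def ncut_bipartition(W):
    """Two-way normalized cut of a symmetric affinity matrix W, using the
    spectral relaxation (second-smallest eigenvector of the normalized
    Laplacian), thresholded at zero."""
    d = W.sum(axis=1)                  # node degrees (positive: W has a unit diagonal)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)     # eigenvalues returned in ascending order
    # Map back to the generalized eigenvector of (D - W) y = lambda D y
    y = d_inv_sqrt * eigvecs[:, 1]
    return (y > 0).astype(int)


# Toy base partitions of six samples (hypothetical data). Cluster labels
# need not match across partitions; the last partition misplaces sample 2.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [2, 2, 2, 5, 5, 5],
    [0, 0, 1, 1, 1, 1],
]
W = consensus_matrix(partitions)
labels = ncut_bipartition(W)
print(labels)
```

On this toy input the majority of base partitions outvotes the noisy one, so samples 0 to 2 end up in one group and samples 3 to 5 in the other; which group is labelled 0 or 1 depends on the sign of the eigenvector. A full pipeline would need a multi-way cut and some way to choose the number of clusters, which this sketch does not attempt.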
More From: IEEE Transactions on Knowledge and Data Engineering