Abstract

Ensemble clustering has been an active research topic in data mining in recent years. Selecting base clustering results that are both high in quality and diverse is key to the quality of the final result. Traditional ensemble clustering selection algorithms usually treat each base clustering result as a whole, which ignores the differences between clusters within the same clustering result and can reduce the validity of the final clustering. To address this problem, and inspired by the uncertainty measures of rough set theory, a dual-granularity weighted ensemble clustering model is proposed. The main contributions of this paper are as follows: (1) the evaluation of cluster reliability is transformed into an uncertainty measurement problem in rough sets; (2) at a finer-grained level, a local similarity measure for samples is designed; (3) a method is proposed for generating the elements of a weighted co-association matrix from global cluster reliability and local sample-pair similarity, after which a fusion function is applied to obtain the final clustering result. Experimental results show that the proposed method is insensitive to the size and diversity of the base clustering members, indicating good robustness and stability. The final result obtained by this model is closer to the actual distribution of the data sets.
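
To make the idea of a weighted co-association matrix concrete, the following is a minimal sketch in Python. It is not the paper's method: the abstract does not give the rough-set reliability measure or the local similarity formula, so the per-cluster weights here are placeholder inputs supplied by the caller, and the function name `weighted_co_association` and its parameters are hypothetical. The sketch only shows the general pattern, in which a sample pair that shares a cluster in a base clustering contributes that cluster's weight, rather than a constant 1, to the matrix entry.

```python
from typing import Dict, Hashable, List, Sequence


def weighted_co_association(
    base_labelings: Sequence[Sequence[Hashable]],
    cluster_weights: Sequence[Dict[Hashable, float]],
) -> List[List[float]]:
    """Build a cluster-weighted co-association matrix (illustrative only).

    base_labelings[m][i] is the cluster label of sample i in base clustering m.
    cluster_weights[m][label] is an assumed reliability weight for that cluster;
    how such weights are derived is outside the scope of this sketch.
    """
    if not base_labelings:
        raise ValueError("at least one base clustering is required")
    if len(base_labelings) != len(cluster_weights):
        raise ValueError("one weight dict is required per base clustering")

    n_members = len(base_labelings)
    n_samples = len(base_labelings[0])
    matrix = [[0.0] * n_samples for _ in range(n_samples)]

    for labels, weights in zip(base_labelings, cluster_weights):
        if len(labels) != n_samples:
            raise ValueError("all base clusterings must label the same samples")
        for i in range(n_samples):
            for j in range(n_samples):
                # A pair contributes only when both samples share a cluster,
                # and the contribution is scaled by that cluster's weight.
                if labels[i] == labels[j]:
                    matrix[i][j] += weights[labels[i]] / n_members
    return matrix


# Two base clusterings of four samples (toy data, not from the paper).
base_labelings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
# Placeholder weights: the large cluster in the second clustering is
# treated as less reliable than the others.
cluster_weights = [
    {0: 1.0, 1: 1.0},
    {0: 0.5, 1: 1.0},
]
co_matrix = weighted_co_association(base_labelings, cluster_weights)
```

In this toy example, samples 0 and 1 are grouped together in both base clusterings and receive a higher entry than samples 0 and 2, which are grouped together only in the lower-weighted cluster. A consensus step (for example, hierarchical clustering on one minus the matrix) could then be applied to such a matrix; which fusion function the paper uses is not stated in the abstract.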
