Abstract
Knowledge-based clustering algorithms can improve traditional clustering models by introducing domain knowledge to identify the underlying data structure. Although several approaches cluster under the guidance of knowledge tidbits, most focus on numeric knowledge and do not consider the uncertain nature of information. To capture this uncertainty, pure numeric knowledge tidbits are expanded to knowledge granules in this article. Two questions then arise: how to obtain granular knowledge, and how to use those knowledge granules in clustering. To this end, a novel knowledge extraction and granulation (KEG) method and a granular knowledge-based fuzzy clustering model are proposed in this study. First, inspired by the concept of natural neighbors, an automatic KEG is developed. In KEG, high-density points are filtered from the dataset and then merged with their natural neighbors to form several dense areas, i.e., granular knowledge. Furthermore, the granular knowledge, expressed as interval or triangular numbers, is incorporated into the clustering algorithm, which gives the framework of fuzzy clustering with granular knowledge. To concretize this model into clustering algorithms, the classical fuzzy C-Means clustering algorithm is selected to incorporate the granular knowledge produced by KEG. Accordingly, fuzzy C-Means clustering with interval knowledge granules (IKG-FCM) and with triangular knowledge granules (TKG-FCM) are proposed. Experiments on synthetic and real-world datasets demonstrate that IKG-FCM and TKG-FCM always achieve better clustering performance at lower time cost, especially on imbalanced data, compared with state-of-the-art algorithms.
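For orientation, the classical fuzzy C-Means algorithm that IKG-FCM and TKG-FCM build on can be sketched as follows. This is a minimal sketch of plain fuzzy C-Means only: it does not implement KEG or any use of knowledge granules, whose details the abstract does not give. The function name `fuzzy_c_means` and its parameters (`m`, `n_iter`, `tol`, `seed`) are assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy C-Means (no knowledge granules); illustrative only.

    X is an (n_samples, n_features) array, m > 1 is the fuzzifier.
    Returns the membership matrix U and the cluster prototypes V.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]

    # Random initial memberships; each row is normalised to sum to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        W = U ** m
        # Prototypes: membership-weighted means of the data points.
        V = (W.T @ X) / W.sum(axis=0)[:, None]

        # Squared Euclidean distances, shape (n_samples, n_clusters).
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero

        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return U, V
```

Calling `fuzzy_c_means(X, n_clusters=2)` returns soft memberships; `U.argmax(axis=1)` gives hard labels if needed. A granular variant would presumably modify the prototype or membership update using the extracted granules, but that step is not specified here and is left out.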