Abstract

Data mining is concerned with the discovery of interesting patterns and knowledge in data repositories. Cluster analysis, one of the core methods of data mining, is the process of discovering homogeneous groups called clusters. Given a dataset and some measure of similarity between data objects, the goal of most clustering algorithms is to maximise both the homogeneity within each cluster and the heterogeneity between different clusters. The multilevel paradigm suggests a hierarchical optimisation process that passes through different levels, moving from a coarse-grained to a fine-grained strategy. The clustering problem is solved by first reducing the problem, level by level, to a coarser problem on which an initial clustering is computed. The clustering of the coarser problem is then mapped back level by level, and the intermediate clusterings obtained at each level are refined to yield a better clustering of the original problem. In this paper, a multilevel genetic algorithm and a multilevel K-means algorithm are introduced for solving the clustering problem. A benchmark of datasets collected from a variety of domains is used to compare the effectiveness of the hierarchical approach against its single-level counterpart.
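The coarsen, cluster, project and refine cycle described above can be illustrated with a minimal Python sketch of a multilevel K-means. It is not the paper's implementation: the abstract does not specify the coarsening scheme, so greedy nearest-neighbour pairing into weighted averages is an assumption, and the names `coarsen`, `lloyd`, `multilevel_kmeans` and the `min_size` parameter are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def coarsen(points, weights):
    """Assumed coarsening: pair each unmatched point with its nearest
    unmatched neighbour and replace the pair by its weighted average."""
    n, dim = len(points), len(points[0])
    matched, parent = [False] * n, [0] * n
    coarse_pts, coarse_w = [], []
    for i in range(n):
        if matched[i]:
            continue
        matched[i] = True
        group = [i]
        free = [j for j in range(n) if not matched[j]]
        if free:
            j = min(free, key=lambda m: sq_dist(points[i], points[m]))
            matched[j] = True
            group.append(j)
        w = sum(weights[g] for g in group)
        merged = tuple(sum(weights[g] * points[g][d] for g in group) / w
                       for d in range(dim))
        for g in group:
            parent[g] = len(coarse_pts)  # fine index -> coarse index
        coarse_pts.append(merged)
        coarse_w.append(w)
    return coarse_pts, coarse_w, parent


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            for p in points]


def update_centers(points, weights, labels, centers):
    """Weighted mean per cluster; an empty cluster keeps its old center."""
    dim, new = len(points[0]), []
    for c in range(len(centers)):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            new.append(centers[c])
            continue
        w = sum(weights[i] for i in members)
        new.append(tuple(sum(weights[i] * points[i][d] for i in members) / w
                         for d in range(dim)))
    return new


def lloyd(points, weights, centers, max_iter=50):
    """Weighted Lloyd (K-means) iterations from the given centers."""
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update_centers(points, weights, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers


def multilevel_kmeans(points, k, min_size=None, seed=0):
    if len(points) < k:
        raise ValueError("need at least k points")
    rng = random.Random(seed)
    min_size = max(k, min_size or 4 * k)
    levels = [([tuple(p) for p in points], [1.0] * len(points))]
    parents = []
    # Coarsening phase: stop before a level would hold fewer than k points.
    while len(levels[-1][0]) > min_size and (len(levels[-1][0]) + 1) // 2 >= k:
        pts, w, par = coarsen(*levels[-1])
        levels.append((pts, w))
        parents.append(par)
    # Initial clustering on the coarsest level.
    pts, w = levels[-1]
    labels, centers = lloyd(pts, w, rng.sample(pts, k))
    # Uncoarsening phase: project labels to the finer level, then refine.
    for level in range(len(parents) - 1, -1, -1):
        pts, w = levels[level]
        labels = [labels[parents[level][i]] for i in range(len(pts))]
        centers = update_centers(pts, w, labels, centers)
        labels, centers = lloyd(pts, w, centers)
    return labels, centers
```

Calling `multilevel_kmeans(points, k=2)` on a list of coordinate tuples returns one label per point together with the final centers. The pairing step here costs quadratic time per level, which keeps the sketch short but would need a cheaper matching rule on large datasets.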
