Abstract

Clustering is a common data mining technique built on the principle that samples within a cluster are similar to one another and dissimilar to samples in other clusters. In other words, each cluster is highly homogeneous internally, while different clusters are highly heterogeneous with respect to one another. However, some users require the opposite: a diversified clustering. In contrast to traditional clustering, diversified clustering aims to make the samples within each cluster highly heterogeneous and the clusters themselves highly homogeneous with respect to one another. Diversified clustering has practical applications in daily life, such as normal class grouping, student grouping in learning, cluster sampling, balanced diets, and job assignment. Nevertheless, our survey of the data mining literature found no prior work on diversified clustering. In this paper, we formally define the problem of diversified clustering and propose a new method to solve it. Experimental results show that our method can generate good diversified clusterings. However, the method is currently suitable only for small data sets, because its execution time increases quickly as the number of diversified clusters increases. We hope this paper will stimulate further research on effective methods for generating diversified clusters in data mining.
