Abstract

In data mining, the input to most algorithms is a data set in which each example is a single feature vector. In many real applications, however, an example contains multiple feature vectors, and its observed class is determined jointly by all of them. We call this kind of example a matrix-object. Some existing clustering algorithms for matrix-object data do not consider the contributions of individual attributes to clusters, so less discriminative attributes may degrade the clustering solution. Other clustering algorithms, designed for data in which each example is a single vector, do consider these contributions but have difficulty handling matrix-object data. In matrix-object data, ordered and cross matrix-object distributions may exist in a cluster, and they call for different ways of measuring cluster quality. In this paper, we propose a weighted matrix-object data clustering algorithm guided by matrix-object distributions. We define cluster compactness and matrix-object compactness, respectively, for the two distributions to measure cluster quality: the greater the compactness, the higher the quality. The proposed algorithm uses the compactness to assign a weight to each attribute for each cluster, and maximizes the weighted cluster and matrix-object compactness to find the optimal weights and the final clustering partition. Furthermore, a regularization term on the weights is added to the objective function so that more of the highly discriminative attributes participate in the optimization. Experimental results on real data demonstrate the effectiveness of the proposed algorithm. Compared with previous clustering algorithms, the proposed algorithm improves clustering performance and enhances the interpretability of the clustering results.
