Co-clustering is the simultaneous clustering of the samples and attributes of a data matrix that provides deeper insight into data than traditional clustering. However, there is a lack of representation learning algorithms that serve this mechanism of co-clustering, and the current representation learning algorithms are limited to the sample perspective and lack the use of information in the attribute perspective. To solve this problem, in this article, ctSNE , a co-representation learning model based on t-distributed stochastic neighbor embedding, is proposed for unsupervised co-clustering, where ctSNE makes the dataset representation outputted more discriminative of row and column clusters (i.e. co-discrimination). On the basis of t-distributed stochastic neighbor embedding retaining the sample data distribution and local data structure, the philosophy of collaboration is introduced (i.e., row and column hidden relationship information) so that the ctSNE model is equipped with co-representation learning capability, which can effectively improve the performance of co-clustering. To prove the effectiveness of the ctSNE model, several classic co-clustering algorithms are used to check the co-representation performance of ctSNE, and a novel internal index based on an internal clustering index, known as total inertia, is proposed to demonstrate the effect of co-clustering. The numerous experimental results show that ctSNE has tremendous co-representation capability and can significantly improve the performance of co-clustering algorithms.
Read full abstract