Co-ClusterD: A Distributed Framework for Data Co-Clustering with Sequential Updates

Sen Su, Jiangtao Yin, Xiang Cheng, Lixin Gao

doi:10.1109/icdm.2013.76

Abstract

Co-clustering is a powerful data mining tool for co-occurrence and dyadic data. As data sets become increasingly large, the scalability of co-clustering becomes more and more important. In this paper, we propose two approaches to parallelize co-clustering with sequential updates in a distributed environment. Based on these two approaches, we present a new distributed framework, Co-ClusterD, that supports efficient implementations of co-clustering algorithms with sequential updates. We design and implement Co-ClusterD, and show its efficiency through two co-clustering algorithms: fast nonnegative matrix tri-factorization (FNMTF) and information theoretic co-clustering (ITCC). We evaluate our framework on both a local cluster of machines and the Amazon EC2 cloud. Our evaluation shows that co-clustering algorithms implemented in Co-ClusterD can achieve better results and run faster than their traditional concurrent counterparts.

Similar Papers

Co-ClusterD: A Distributed Framework for Data Co-Clustering with Sequential Updates
Xiang Cheng ... Sen Su
IEEE Transactions on Knowledge and Data Engineering | VOL. 27
01 Dec 2015

OvNMTF Algorithm: an Overlapping Non-Negative Matrix Tri-Factorization for Coclustering
Waldyr L De Freitas ... Lucas Fernandes Brunialti
01 Jul 2020

HICC: an entropy splitting-based framework for hierarchical co-clustering
Wei Cheng ... Wei Wang
Knowledge and Information Systems | VOL. 46
10 Feb 2015

Single multiplicatively updated matrix factorization for co-clustering
Zhao Li ... Xindong Wu
24 Mar 2014
