Abstract

In unsupervised data mining, one is usually interested in discovering groups of data that exhibit a certain kind of coherence. A classical technique for unsupervised data partitioning is cluster analysis, where objects are sorted into groups in such a way that the degree of association between two objects is maximal if they belong to the same group and minimal otherwise. Cluster analysis has been applied to many classification problems. Wu, Liew, and Yan (2004) apply clustering to find natural groupings in the data. Borland, Hirschberg, and Lye (2001) use clustering for data reduction, where a group of similar objects is summarized by a representative sample from the group. Recently, clustering has been applied extensively in gene expression data analysis. In gene expression data, the objects along the row dimension correspond to genes or other DNA sequences, and the attributes along the column dimension correspond to cDNA microarray experiments or time-point samples. Clustering in the row direction, or gene-wise clustering, has been performed, for example, on yeast gene expression data and human cell data (Spellman, Sherlock, Zhang, et al., 1998; Eisen, Spellman, Brown, & Botstein, 1998), whereas clustering in the column direction, or sample-wise clustering, has been performed, for example, for cancer type classification (Golub, Slonim, Tamayo, et al., 1999; Klein, Tu, Stolovitzky, et al., 2001). However, in many real-world datasets, not all attributes of an object are relevant for grouping the objects into meaningful classes. In many cases, some attributes are relevant to only some of the clusters, and different clusters may have different relevant subsets of attributes. By relaxing the constraint that related objects must behave similarly across the entire set of attributes, biclustering considers only a relevant subset of attributes when looking for similarity between objects.
In this article, we give an overview of the biclustering problem, discuss some common biclustering algorithms, and highlight some interesting applications of biclustering.
