Abstract

Microarrays are a widely used measurement technology that records the expression levels of a large number of genes across different time points. Cluster analysis, and bi-clustering in particular, extracts meaningful information from the correlation of a subset of genes with a subset of conditions. This helps in discovering biologically meaningful clusters in the presence of the missing values, imprecision and noise found in microarray data sets. Although fuzzy sets are sufficient to handle the overlapping nature of bi-clusters, shadowed sets help in identifying and analyzing the genes that lie in the ambiguous boundary region of the clusters. In this article, we propose a bi-clustering model based on shadowed sets with a gradual representation of cardinality, named Gradual Shadowed Set for Gene Expression (GSS-GE) clustering. It identifies the bi-clusters in the core and in the shadowed region and evaluates their biological significance. The effectiveness of the proposed GSS-GE is demonstrated on three real data sets: yeast, serum and mouse. Its performance is compared with Cheng and Church's algorithm (CC), Bimax, the order-preserving submatrix (OPSM) algorithm, Large Average Submatrices (LAS), the statistical plaid model and a modified fuzzy co-clustering (MFCC) algorithm. No cluster-level analysis of the mouse microarray data set has been reported so far. We also report statistical and biological significance results to support the superiority of the proposed GSS-GE.
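
To make the core and shadowed regions concrete, the following is a minimal sketch of the classic three-way shadowed-set partition of fuzzy membership values. It is not the GSS-GE model itself and does not implement the gradual representation of cardinality. The function name `shadowed_partition`, the threshold `alpha` and the sample membership values are assumptions chosen for illustration only.

```python
def shadowed_partition(memberships, alpha=0.3):
    """Split item indices into core, shadow and exclusion regions.

    Illustrative only: a plain shadowed-set thresholding of fuzzy
    memberships, with a hypothetical threshold alpha in (0, 0.5).
    """
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")

    core, shadow, exclusion = [], [], []
    for idx, u in enumerate(memberships):
        if u >= 1.0 - alpha:
            core.append(idx)        # membership elevated to 1
        elif u <= alpha:
            exclusion.append(idx)   # membership reduced to 0
        else:
            shadow.append(idx)      # membership left uncertain
    return core, shadow, exclusion


# Hypothetical memberships of five genes in one cluster.
gene_memberships = [0.95, 0.50, 0.10, 0.75, 0.25]
core, shadow, exclusion = shadowed_partition(gene_memberships, alpha=0.3)
```

In this sketch, genes with high membership fall in the core, genes with low membership are excluded, and the remaining genes land in the shadowed region, which is where the abstract locates the genes whose cluster assignment is ambiguous.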
