Abstract

Conventional clustering algorithms based on Euclidean distance or the Pearson correlation coefficient cannot incorporate order information into the distance metric, nor can they distinguish random patterns from real biological ones. We present a template-based clustering algorithm for time-series gene expression data. Template profiles are defined by the up- or down-regulation of genes between consecutive time points. Genes are assigned to templates using a fuzzy membership function. A multi-objective evolutionary algorithm is used to determine compact clusters with a varying number of templates. The statistical significance of each template is determined using a permutation-based non-parametric test. Statistically significant profiles are further tested for biological relevance using gene ontology analysis. The algorithm was able to distinguish between real and noisy patterns when tested on artificial and real biological data. On a real biological data set, the proposed algorithm performed better than or comparably to STEM and better than k-means.
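
To make the idea of up-down templates concrete, the following is a minimal sketch and not the authors' implementation. It enumerates every up/down pattern over consecutive time points and scores a gene against each one. The membership function used here (the share of total absolute change that moves in the template's direction) is an assumption chosen for illustration; the abstract does not specify the actual fuzzy membership function. The function names and the example profile are likewise hypothetical.

```python
from itertools import product


def transitions(profile):
    """Differences between consecutive time points of one expression profile."""
    return [b - a for a, b in zip(profile, profile[1:])]


def all_templates(n_timepoints):
    """Every up (+1) / down (-1) pattern over the n_timepoints - 1 transitions."""
    return list(product((1, -1), repeat=n_timepoints - 1))


def membership(profile, template):
    """Illustrative membership in [0, 1] (an assumption, not the paper's formula):
    the share of total absolute change that moves in the template's direction."""
    diffs = transitions(profile)
    total = sum(abs(d) for d in diffs)
    if total == 0:
        return 0.0  # a flat profile matches no up/down template
    agree = sum(abs(d) for d, t in zip(diffs, template) if d * t > 0)
    return agree / total


# Hypothetical expression values for one gene at four time points.
gene = [0.0, 1.0, 3.0, 2.5]
templates = all_templates(len(gene))
best = max(templates, key=lambda t: membership(gene, t))
print(best, membership(gene, best))
```

Under this scoring a gene can have non-zero membership in several templates at once, which is the property a fuzzy assignment relies on. The evolutionary search over template subsets and the permutation test would build on scores of this kind.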

