Abstract

This paper describes a new technique for clustering short time series of gene expression data. The technique generalizes template-based clustering and relies on a qualitative representation of expression profiles, which are labelled using trend Temporal Abstractions (TAs); clusters are then identified dynamically on the basis of this qualitative representation. Clustering is performed efficiently at three levels of aggregation of the qualitative labels, each level corresponding to a different granularity of the qualitative representation. We show that the resulting TA-clustering algorithm is robust and efficient, and that it performs better than standard hierarchical agglomerative clustering when the time series are temporally dislocated. The results of the TA-clustering algorithm can be visualized as a three-level hierarchical tree of qualitative representations and are therefore easy to interpret. We demonstrate the utility of the algorithm on two simulated data sets and on gene expression data from S. cerevisiae.
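
As a rough illustration of the idea, the following minimal Python sketch labels each step of a profile with a trend (I for increasing, D for decreasing, S for steady) and groups profiles that share the same label string at three levels of detail. It is not the authors' implementation: the threshold `eps`, the function names, and in particular the definitions of the three levels (trends without steady phases, trends with steady phases, one label per time step) are assumptions made for this example only.

```python
from collections import defaultdict


def trend_labels(series, eps=0.5):
    """Label each consecutive step as I (increasing), D (decreasing) or S (steady).

    eps is a hypothetical threshold: changes within [-eps, eps] count as steady.
    """
    labels = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > eps:
            labels.append("I")
        elif delta < -eps:
            labels.append("D")
        else:
            labels.append("S")
    return labels


def collapse(labels):
    """Merge runs of identical labels, e.g. I, I, S, D becomes I, S, D."""
    merged = []
    for label in labels:
        if not merged or merged[-1] != label:
            merged.append(label)
    return merged


def representations(series, eps=0.5):
    """Return three label strings, from coarsest to finest.

    The level definitions are assumptions for this sketch:
      level 1: sequence of I/D trends, steady steps dropped, durations ignored
      level 2: sequence of I/D/S trends, durations ignored
      level 3: one label per time step, so timing is preserved
    """
    fine = trend_labels(series, eps)
    level3 = "".join(fine)
    level2 = "".join(collapse(fine))
    level1 = "".join(collapse([label for label in fine if label != "S"]))
    return level1, level2, level3


def ta_cluster(profiles, eps=0.5):
    """Group profile names into a three-level tree keyed by the label strings."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for name, series in profiles.items():
        level1, level2, level3 = representations(series, eps)
        tree[level1][level2][level3].append(name)
    return tree


# Toy profiles (made-up values): g2 follows the same rise and fall as g1,
# but shifted by one time point.
profiles = {
    "g1": [0, 1, 2, 2, 1],
    "g2": [0, 0, 1, 2, 1],
    "g3": [2, 1, 0, 0, 1],
}
tree = ta_cluster(profiles)
for level1, subtree in tree.items():
    for level2, leaves in subtree.items():
        for level3, names in leaves.items():
            print(level1, level2, level3, names)
```

In this toy example, g1 and g2 fall into the same coarsest group (both rise and then fall) even though their finer label strings differ, which is the kind of tolerance to temporal dislocation the abstract refers to. Because clusters are keyed by label strings rather than by pairwise distances, the number of clusters does not have to be fixed in advance.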
