Abstract

Recently developed, highly parallelized sequencing technologies now allow even small research groups to conduct multi-time-point analyses at affordable time and cost, so the number of available time-series gene expression data sets is increasing rapidly. However, when time-series data generated by different research groups are considered together, meta-properties such as the sampled time points and the age of the samples are heterogeneous across data sets. We therefore propose a novel three-step analysis algorithm to integrate heterogeneous time-series gene expression data sets. The key ideas of the algorithm are to convert incomparable, heterogeneous multi-time-point data into comparable differentially expressed gene (DEG) vectors using time-point clustering, and to determine the consensus DEG vector that minimizes the sum of cosine distances to the input DEG vectors. When tested on 12 heterogeneous time-series gene expression data sets from low-temperature stress treatments, the algorithm was able to detect low-temperature-responsive genes.
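
The consensus step can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each data set has already been reduced to a DEG vector with one entry per gene (+1 up-regulated, -1 down-regulated, 0 unchanged), and it lets the consensus be any real-valued vector. Under that relaxation, the vector minimizing the sum of cosine distances points in the direction of the sum of the length-normalized inputs. The function names and the toy data are hypothetical.

```python
import numpy as np


def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v); defined as 1.0 if either vector is all zeros."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - float(np.dot(u, v) / (nu * nv))


def consensus_vector(deg_vectors):
    """Real-valued consensus: the normalized sum of the unit-length input vectors.

    Assumption: the consensus is not restricted to {-1, 0, +1} entries.
    All-zero inputs are skipped because their distance is constant (1.0).
    """
    X = np.asarray(deg_vectors, dtype=float)
    norms = np.linalg.norm(X, axis=1)
    keep = norms > 0
    unit = X[keep] / norms[keep, None]
    s = unit.sum(axis=0)
    n = np.linalg.norm(s)
    return s / n if n > 0 else s


def total_cosine_distance(c, deg_vectors):
    """Objective value: sum of cosine distances from c to every input vector."""
    return sum(cosine_distance(c, v) for v in deg_vectors)


# Toy input: three data sets, four genes (hypothetical values).
deg_vectors = [
    [1, 1, 0, -1],
    [1, 0, 0, -1],
    [1, 1, 0, 0],
]
consensus = consensus_vector(deg_vectors)
print(consensus)
print(total_cosine_distance(consensus, deg_vectors))
```

In this toy example the consensus keeps the sign on which the inputs agree for each gene, and its total cosine distance is no larger than that of any single input vector. A discrete consensus could be obtained by thresholding the entries, but the threshold would be a further assumption.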
