Abstract

Recently developed, highly parallelized sequencing technologies now allow even small research groups to conduct multi-time-point analyses at affordable time and cost, so the number of available time-series gene expression data sets is increasing rapidly. However, when time-series data generated by different research groups are considered together, meta-properties such as the sampled time points and the age of the samples are heterogeneous across the data sets. We therefore propose a novel three-step analysis algorithm to integrate heterogeneous time-series gene expression data sets. The key ideas of the algorithm are to convert incomparable, heterogeneous multi-time-point data into comparable differentially expressed gene (DEG) vectors using time-point clustering, and to determine the consensus DEG vector that minimizes the sum of cosine distances to the input DEG vectors. When tested on 12 heterogeneous time-series gene expression data sets from low-temperature stress treatments, the algorithm was able to detect low-temperature-responsive genes.
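The consensus step can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions not stated in the abstract: each DEG vector is taken to hold one value per gene (here -1 for down-regulated, 0 for unchanged, 1 for up-regulated), and the consensus is allowed to be any real-valued vector. Under that relaxation, the sum of cosine distances is minimized by the sum of the unit-normalized input vectors, because the total distance equals the number of inputs minus the dot product between the unit consensus and that sum. The function names and the toy data are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v); a zero vector is treated as distance 1."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def consensus_vector(deg_vectors):
    """Sum the unit-normalized inputs (assumed real-valued relaxation).

    The direction of this sum minimizes the total cosine distance to the
    inputs; all-zero input vectors are skipped because they have no direction.
    """
    n_genes = len(deg_vectors[0])
    total = [0.0] * n_genes
    for vec in deg_vectors:
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0:
            continue
        for i, x in enumerate(vec):
            total[i] += x / norm
    return total


def total_distance(candidate, deg_vectors):
    """Sum of cosine distances from one candidate to every input vector."""
    return sum(cosine_distance(candidate, vec) for vec in deg_vectors)


# Toy data (hypothetical): three data sets, four genes each.
deg_vectors = [
    [1, 1, 0, -1],
    [1, 0, 0, -1],
    [1, 1, 1, 0],
]
consensus = consensus_vector(deg_vectors)
print(consensus, total_distance(consensus, deg_vectors))
```

If the consensus must itself be a discrete DEG vector, the real-valued result would still need to be thresholded or the discrete candidates searched; the abstract does not say which approach the algorithm takes.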

