Abstract

Background: The increasing availability of time-series expression data opens up new possibilities to study functional linkages of genes. Present methods used to infer functional linkages between genes from expression data are mainly based on a point-to-point comparison. Change trends between consecutive time points in time-series data have so far not been well explored.

Results: In this work we present a new method based on extracting the main features of the change trend and level of gene expression between consecutive time points. The method, termed trend correlation (TC), includes two major steps: (1) calculating a maximal local alignment of change trend score by dynamic programming and a change trend correlation coefficient between the maximal matched change levels of each gene pair; (2) inferring relationships of gene pairs based on two statistical extraction procedures. The new method considers time shifts and inverted relationships in a similar way to the local clustering (LC) method, but the latter is based merely on a point-to-point comparison. The TC method is demonstrated with data from the yeast cell cycle and compared with the LC method and the widely used Pearson correlation coefficient (PCC) based clustering method. The biological significance of the gene pairs is examined with several large-scale yeast databases. Although the TC method predicts an overall lower number of gene pairs than the other two methods at the same p-value threshold, the additional number of gene pairs inferred by the TC method is considerable: e.g. 20.5% compared with the LC method and 49.6% compared with the PCC method for a p-value threshold of 2.7E-3. Moreover, the percentage of inferred gene pairs consistent with the databases is generally higher for our method than for the LC method and similar to that of the PCC method.

A significant number of the gene pairs inferred only by the TC method are process-identity or function-similarity pairs or have well-documented biological interactions, including 443 known protein interactions and some known cell cycle related regulatory interactions. It should be emphasized that the overlap among the gene pairs detected by the three methods is normally not very high, indicating the necessity of combining the different methods when searching for functional associations of genes in time-series data. For a p-value threshold of 1E-5, the percentages of process-identity and function-similarity gene pairs among the pairs shared by the three methods reach 60.2% and 55.6% respectively, building a good basis for further experimental and functional study. Furthermore, the combined use of methods is important for inferring more complete regulatory circuits and networks, as exemplified in this study.

Conclusion: The TC method can significantly augment the current major methods for inferring functional linkages and biological networks and is well suited for exploring temporal relationships of gene expression in time-series data.
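The first step of the TC method, a maximal local alignment of change trends found by dynamic programming, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `change_trends` and `max_local_trend_score`, the tolerance `tol`, and the scoring scheme (+1 for matching directions, -1 for opposite directions, 0 when either trend is flat) are assumptions made for illustration, and the change-level correlation coefficient and the statistical extraction procedures are not covered.

```python
def change_trends(profile, tol=0.0):
    """Map an expression profile to trends between consecutive time points.

    Returns +1 (increase), -1 (decrease) or 0 (change within `tol`).
    `tol` is a hypothetical tolerance, not a parameter from the paper.
    """
    trends = []
    for prev, curr in zip(profile, profile[1:]):
        diff = curr - prev
        if abs(diff) <= tol:
            trends.append(0)
        else:
            trends.append(1 if diff > 0 else -1)
    return trends


def max_local_trend_score(a, b):
    """Best gap-free local alignment score between two trend sequences.

    Uses the recurrence E[i][j] = max(0, E[i-1][j-1] + a[i-1] * b[j-1]),
    so a run of matching trends accumulates score along a diagonal and
    the diagonal offset (j - i) gives the time shift of b relative to a.
    Returns (best_score, shift).
    """
    best, best_shift = 0, 0
    prev_row = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        row = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            row[j] = max(0, prev_row[j - 1] + a[i - 1] * b[j - 1])
            if row[j] > best:
                best, best_shift = row[j], j - i
        prev_row = row
    return best, best_shift


# Toy profiles (made-up values): y repeats x one time point later.
x = [1, 2, 3, 2, 1, 2, 3]
y = [0, 1, 2, 3, 2, 1, 2]
score, shift = max_local_trend_score(change_trends(x), change_trends(y))

# An inverted relationship can be scored by negating one trend sequence.
inverted = [-v for v in x]
inv_score, inv_shift = max_local_trend_score(
    change_trends(x), [-t for t in change_trends(inverted)]
)
```

In this sketch a time-shifted pair is picked up because the best-scoring diagonal lies off the main diagonal, while an inverted pair is handled by aligning against the negated trend sequence.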

Highlights

  • The increasing availability of time-series expression data opens up new possibilities to study functional linkages of genes

  • The percentage of inferred gene pairs consistent with the databases is generally higher for our method than for the local clustering (LC) method and similar to that of the Pearson correlation coefficient (PCC) method

  • Only a portion of the gene pairs inferred by the trend correlation (TC) method are also found by the PCC and LC methods

Summary

Introduction

The increasing availability of time-series expression data opens up new possibilities to study functional linkages of genes. A commonly used approach at present is to compare gene expression at discrete time points, for example between different genotypes or cell lines, between diseased and healthy (control) subjects, or under different physiological conditions. This type of static gene expression profiling can already give useful information on the patterns of significantly differentiated expression of genes. Most of the current work in this area is limited to the analysis of a relatively small set of genes due to computational complexity [9,10,11]. Another class of methods infers functional association from large-scale gene expression data by defining a statistical threshold for the association. By simplifying a time-series profile into a sequence of decrease or increase events, this method is more robust to noise. It does not, however, make full use of the information contained in the gene expression levels in the original data. The rank of expression values is less sensitive to noise or outliers, but this method is still based on point-to-point comparison per se.
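The contrast between level-based and rank-based point-to-point measures can be sketched in a few lines of Python. The example below is illustrative only: the helper names `pearson`, `ranks` and `rank_correlation` and the toy values are assumptions, not data or code from the study, and the rank helper assumes there are no tied values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """Ranks 1..n of the values (simplification: assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_correlation(x, y):
    """Pearson correlation computed on ranks instead of raw levels."""
    return pearson(ranks(x), ranks(y))


# Toy profiles (made-up values): gene_b follows gene_a except for one
# outlying measurement at the last time point.
gene_a = [1, 2, 3, 4, 5]
gene_b = [2, 4, 6, 8, 100]

level_based = pearson(gene_a, gene_b)          # pulled down by the outlier
rank_based = rank_correlation(gene_a, gene_b)  # 1 up to rounding: same ordering
```

Both measures still compare the two profiles time point by time point, so neither captures a relationship in which one gene follows the other with a delay.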
