Abstract
Background
Inferring regulatory interactions between genes from time-resolved transcriptomics data, yielding reverse-engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes, coupled with short and often noisy time-resolved read-outs of the system, makes reverse engineering a challenging task. Therefore, the development and assessment of methods that are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications.
Results
Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach, which are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes that are particularly suitable for short transcriptomics time series. We compare the 21 measures and 6 scoring schemes considered according to their ability to correctly reconstruct such networks from short time series data by calculating summary statistics based on the corresponding specificity and sensitivity. Our results demonstrate that rank-based and symbol-based measures have the highest performance in inferring regulatory interactions. In addition, the proposed asymmetric weighting scoring scheme is shown to be valuable in reducing the number of false positive interactions. On the other hand, Granger causality and information-theoretic measures, which are frequently used in the inference of regulatory networks, show low performance on the short time series analyzed in this study.
Conclusions
Our study is intended to serve as a guide for choosing a particular combination of similarity measures and scoring schemes suitable for the reconstruction of gene regulatory networks from short time series data. We show that further improvement of algorithms for reverse engineering can be obtained by considering measures that are rooted in the study of symbolic dynamics or ranks, in contrast to common similarity measures that do not consider the temporal character of the data. Moreover, we establish that the asymmetric weighting scoring scheme, together with symbol-based measures (for low noise levels) and rank-based measures (for high noise levels), is the most suitable choice.
Highlights
Inferring regulatory interactions between genes from transcriptomics time-resolved data, yielding reverse engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies
We show and discuss the receiver operating characteristic (ROC) curves obtained for the basic algorithm with the identity (ID) scoring scheme in combination with all measures included in our investigation, and for the five additional scoring schemes, Context Likelihood of Relatedness (CLR), Algorithm for the Reconstruction of Accurate Cellular NEtworks (ARACNE), Minimum Redundancy NETwork (MRNET), Time Shift (TS), and Asymmetric WEighting (AWE), each with selected measures
Symmetric scoring schemes (CLR, ARACNE, and MRNET): we evaluate the possibility of reconstructing the underlying E. coli network based on the three modifications of the relevance network algorithm (Algorithm 1, given in the Methods section) as implemented in the "minet" package, namely CLR, ARACNE, and MRNET
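The relevance network idea referred to in the highlights above can be illustrated with a minimal sketch. Algorithm 1 and the 21 measures are not reproduced in this excerpt, so everything below is an assumption made for illustration: absolute Spearman correlation stands in for a rank-based measure, the identity scheme is taken to mean thresholding the raw similarity, the CLR-style rescoring follows the commonly described z-score form rather than the "minet" implementation, and the function names and toy data are hypothetical.

```python
import numpy as np

def rank_rows(x):
    """Replace each row's values by their ranks 0..T-1 (assumes no ties)."""
    return np.argsort(np.argsort(x, axis=1), axis=1).astype(float)

def spearman_similarity(expr):
    """Absolute Spearman correlation between all gene pairs.

    expr has shape (n_genes, n_timepoints).
    """
    sim = np.abs(np.corrcoef(rank_rows(expr)))
    np.fill_diagonal(sim, 0.0)  # ignore self-interactions
    return sim

def clr_scores(sim):
    """CLR-style rescoring: z-score each entry against its row background.

    Simplification: the zeroed diagonal is included in the background.
    """
    mu = sim.mean(axis=1, keepdims=True)
    sd = sim.std(axis=1, keepdims=True) + 1e-12
    z = np.maximum(0.0, (sim - mu) / sd)
    return np.sqrt(z**2 + z.T**2)

def sensitivity_specificity(scores, truth, threshold):
    """Compare thresholded scores with a known undirected network.

    Assumes truth contains at least one edge and one non-edge.
    """
    iu = np.triu_indices_from(scores, k=1)  # each gene pair once
    pred = scores[iu] >= threshold
    true = truth[iu].astype(bool)
    tp = np.sum(pred & true)
    fn = np.sum(~pred & true)
    tn = np.sum(~pred & ~true)
    fp = np.sum(pred & ~true)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data (invented): gene 1 is a monotone function of gene 0,
# gene 2 follows an unrelated pattern.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.0, 4.0, 9.0, 16.0, 25.0, 36.0],
    [3.0, 1.0, 6.0, 2.0, 5.0, 4.0],
])
truth = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])

sim = spearman_similarity(expr)   # identity scheme: use sim directly
clr = clr_scores(sim)             # CLR-style rescoring of the same matrix
sens, spec = sensitivity_specificity(sim, truth, threshold=0.8)
print(sim.round(3), clr.round(3), sens, spec, sep="\n")
```

Sweeping the threshold over the range of scores and recording the resulting (1 - specificity, sensitivity) pairs would give the points of a ROC curve for one measure and scoring scheme.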
Summary
Inferring regulatory interactions between genes from time-resolved transcriptomics data, yielding reverse-engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. The ever-increasing throughput in experimental manipulation of gene activity, coupled with methods for quantitative assessment of the transcriptome, proteome, and metabolome, has begun to identify the effects of individual transcription factors, binding ligands, and post-translational modifications on regulated genes [10]. Such high-throughput transcriptomics data sets can be used to identify gene regulatory modules and entire networks. Understanding the complex network of gene regulatory interactions from a given transcriptome read-out necessitates the design, analysis, and testing of network-inference methods (so-called reverse engineering methods). These methods operate on two types of data sets: (1) those from static perturbation experiments, whose read-out is a pseudo steady-state expression level, and (2) those from time-resolved experiments, which yield time series of gene expression.