Abstract

Background: Reverse engineering of gene regulatory networks (GRNs) from gene expression data is a classical challenge in systems biology. Thanks to high-throughput technologies, a massive amount of gene expression data has accumulated in public repositories. Modelling GRNs from multiple experiments (also called integrative analysis) has therefore naturally become a standard procedure in modern computational biology. Such analysis is usually more robust than traditional approaches, which analyse individual datasets and therefore suffer from experimental biases and low sample numbers. To date, there are two main strategies for this problem: the first ("data merging") merges all datasets together and then infers a GRN, whereas the other ("networks ensemble") infers a GRN from every dataset separately and then aggregates them using ensemble rules (such as ranksum or weightsum). Unfortunately, a thorough comparison of these two approaches is lacking.

Results: In this work, we present another meta-analysis approach for inferring GRNs from multiple studies. The proposed approach, adapted to methods based on pairwise measures such as correlation or mutual information, consists of two steps: aggregating the matrices of pairwise measures from every dataset, then extracting the network from the resulting meta-matrix. We then evaluate the performance of the two commonly used approaches mentioned above and of our approach with a systematic set of experiments based on in silico benchmarks.

Conclusions: We present a first systematic evaluation of different strategies for reverse engineering GRNs from multiple datasets. The experimental results strongly suggest that assembling matrices of pairwise dependencies is a better strategy for network inference than the two commonly used ones.

Highlights

  • Reverse engineering of gene regulatory networks (GRNs) from gene expression data is a classical challenge in systems biology

  • Thanks to high-throughput technologies, a large amount of transcriptome data is available through public repositories (e.g. NCBI Gene Expression Omnibus (GEO) [1], ArrayExpress [2]), providing opportunities to study the GRNs of many organisms

  • MI-based algorithms start by estimating the mutual information between all pairs of genes, resulting in a mutual information matrix (MIM)
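
The last highlight can be illustrated with a minimal sketch of how a MIM might be built. The text here does not say which estimator the authors use, so the equal-width binning and the plug-in (empirical) estimator below are assumptions made purely for illustration. The names `discretize`, `mutual_information`, and `mim` are hypothetical helpers, not part of any package referenced by the paper.

```python
import math
from collections import Counter


def discretize(values, bins=3):
    """Equal-width binning of floats into integer labels 0..bins-1 (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant profile carries no information; put everything in one bin.
        return [0] * len(values)
    width = (hi - lo) / bins
    # The maximum value would land in bin `bins`, so clamp it to the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Plug-in estimate of mutual information (in nats) between two label sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # sum over observed pairs of p(a, b) * log(p(a, b) / (p(a) * p(b)))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mim(expr, bins=3):
    """Build a mutual information matrix.

    expr maps a gene name to its expression values, one value per sample.
    Returns the sorted gene names and a square matrix (list of lists).
    """
    genes = sorted(expr)
    disc = {g: discretize(expr[g], bins) for g in genes}
    matrix = [
        [mutual_information(disc[a], disc[b]) for b in genes]
        for a in genes
    ]
    return genes, matrix
```

Calling `mim({"g1": [...], "g2": [...]})` on one dataset gives one matrix. Note that in this sketch the diagonal holds each gene's estimated entropy, and that a plug-in estimator is biased for small sample sizes.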


Summary

Results

Simulated datasets

There are two tasks to consider when validating networks: 1) defining a “gold standard”, that is, a set of true regulations describing the underlying interaction network; and 2) selecting quantitative measures to statistically assess the quality of the inferred networks.

iii) Level 3: In addition to the two previous types of noise, each sub-dataset can be transformed using a randomly chosen non-linear transformation such as x^2 or log(x). This random data transformation is not meant to be realistic, but rather to let us better assess the behaviour of each meta-method when faced with extreme distortion.

We can clearly observe that N2 outperforms all three D methods, suggesting that assembling networks is better than merging datasets. This could be explained by the fact that gene expression values differ greatly across experiments because of the simulated batch effects (i.e. datasets with different global and local noise).

It should be noted that assembling mutual information matrices (M methods) surpasses the two other well-known strategies (D and N) for all datasets at every level of distortion, in particular for MRNET. Within the M family of methods, combining MIMs using a random effects model (M1) appears to be better than the two other strategies, the internal quality control index (M2) and the median method (M3).
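
To make the aggregation step concrete, here is a minimal sketch of the median method (M3), read as an element-wise median over the per-dataset MIMs. That reading is an assumption based only on the method's name. The random effects model (M1) and the internal quality control index (M2) are not sketched, because the text here does not describe them in enough detail. The function name `median_meta_matrix` is hypothetical.

```python
from statistics import median


def median_meta_matrix(matrices):
    """Element-wise median of equally sized square matrices (lists of lists).

    Each input matrix is assumed to be the MIM of one dataset, with genes
    in the same order in every matrix.
    """
    if not matrices:
        raise ValueError("need at least one matrix")
    n = len(matrices[0])
    if any(len(m) != n or any(len(row) != n for row in m) for m in matrices):
        raise ValueError("all matrices must be n x n with the same n")
    return [
        [median(m[i][j] for m in matrices) for j in range(n)]
        for i in range(n)
    ]
```

The resulting meta-matrix would then be passed to a network extraction method such as MRNET, which is not reproduced here.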
