Abstract
Gene regulatory network (GRN) reconstruction is essential for understanding the functioning and pathology of a biological system. Numerous models and algorithms have been developed to unravel GRNs. The DREAM project aims to clarify the advantages and disadvantages of these methods from an application viewpoint. An interesting and somewhat surprising observation is that methods based on simple statistics, such as the so-called Z-score, usually perform better than more complicated methods such as those based on nonlinear differential equations. A fundamental problem with the Z-score, however, is that direct and indirect regulations cannot be easily distinguished. To overcome this drawback, this paper proposes a GRN inference algorithm based on relative expression level variation (RELV), which consists of three major steps. First, the magnitude of the RELV of each gene is estimated from wild-type and single-gene knockout/knockdown experimental data. Second, the probability that a direct regulation exists from a perturbed gene to a measured gene is estimated, and this probability is then used to estimate whether a gene can be regulated by other genes. Finally, the normalized RELVs are modified so that genes with an estimated in-degree of zero have RELVs of smaller magnitude than the other genes; the modified RELVs are then used to rank the possible direct regulations among genes, which yields an estimate of the GRN topology. In principle, this method can avoid the so-called cascade errors in certain situations. Computational results on the Size 100 sub-challenges of DREAM3 and DREAM4 show that, compared with the Z-score based method, prediction performance can be substantially improved, especially in terms of AUPR. Moreover, the method can even outperform the best team in both DREAM3 and DREAM4. Furthermore, the high precision of the most reliable predictions suggests that the algorithm may be very helpful in guiding the design of biological experiments.
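To make the contrast between the two statistics concrete, the following is a minimal sketch and not the authors' implementation. It assumes a knockout matrix whose entry (i, j) is the steady-state level of gene j when gene i is knocked out, takes the relative variation to be the change from wild type divided by the wild-type level, and takes the baseline score to be a per-gene standardized deviation across experiments. The function names, both formulas, and the toy data are illustrative assumptions; the probability estimation and in-degree based modification steps described in the abstract are not reproduced here.

```python
import numpy as np


def relative_variation(wild_type, knockout, eps=1e-12):
    """Assumed form: (knockout level - wild-type level) / wild-type level.

    knockout[i, j] is the level of gene j when gene i is knocked out.
    """
    wt = np.asarray(wild_type, dtype=float)
    ko = np.asarray(knockout, dtype=float)
    return (ko - wt[None, :]) / (np.abs(wt[None, :]) + eps)


def standardized_score(knockout, eps=1e-12):
    """Assumed baseline: deviation of each gene from its mean across
    all knockout experiments, divided by its standard deviation."""
    ko = np.asarray(knockout, dtype=float)
    mean = ko.mean(axis=0)
    std = ko.std(axis=0)
    return (ko - mean[None, :]) / (std[None, :] + eps)


def rank_edges(score):
    """Return candidate edges (regulator, target), ordered from the
    largest to the smallest score magnitude, excluding self-loops."""
    magnitude = np.abs(np.asarray(score, dtype=float)).copy()
    np.fill_diagonal(magnitude, -np.inf)  # push self-regulation to the end
    n = magnitude.shape[0]
    order = np.argsort(magnitude, axis=None)[::-1]
    pairs = [divmod(int(k), n) for k in order]
    return [(i, j) for i, j in pairs if i != j]


if __name__ == "__main__":
    # Toy data (hypothetical): knocking out gene 0 halves gene 2.
    wild_type = [1.0, 2.0, 4.0]
    knockout = [
        [0.0, 2.0, 2.0],
        [1.0, 0.0, 4.0],
        [1.0, 2.0, 0.0],
    ]
    relv = relative_variation(wild_type, knockout)
    print(rank_edges(relv)[:3])
```

Ranking by score magnitude alone is the kind of procedure in which an indirect effect (gene A changes gene C only through gene B) can be ranked as highly as a direct one, which is the limitation the abstract attributes to the Z-score.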
Highlights
In the post-genomic era, one of the fundamental tasks is reconstructing gene regulatory networks (GRNs) from experimental data and other a priori information.
It is hoped that this reconstruction will help both in understanding cell functions and in gaining additional insights into the processes of some complicated diseases, which might lead to the discovery of new target genes.
With the development of high-throughput technologies such as DNA microarrays and mass spectroscopy, it has become possible to measure the expression of thousands of genes simultaneously [1,2]. Stimulated by these technological advances, a variety of models and methods have been proposed for GRN reconstruction, such as Boolean networks [3,4], Bayesian networks [5,6], information theory based algorithms [7,8,9,10], and ordinary differential equation (ODE) based methods [11,12,13].
Summary
It is suggested to modify the 5 largest RELVs of a gene obtained from knockout experimental data, provided that this gene is estimated to have a nonzero in-degree.