Abstract

Background: Gene-gene associations in functional genomics are typically operationalized through correlational or information-theoretic measures, yielding an undirected graph of pairwise interactions in which nodes represent genes and edges represent associations between pairs of genes. Such functional connectivity can be uninformative, because gene expression tends to be very densely correlated. Moreover, functional connectivity contains no information about the causal, or directed, interactions between genes. Causal inference would make it possible to ask whether phenotypes are the product of multiple gene-gene interactions or whether there are epicentres in the genome that lead to the phenotypes by influencing a number of genes in the network.

Methods: We use open-access data from the study by Jaffe et al. (2014), which reports gene expression in the Dorsolateral Pre-Frontal Cortex (DLPFC), measured post mortem in humans. We selected the data for 239 genes that were found to be associated with Obsessive Compulsive Disorder (OCD) in this study, as well as a random selection of 239 genes that were not found to be associated with OCD. To avoid confounders, we consider only the control cohort of N=102 participants. The data were normalized using a log-transform, first gene-wise and then subject-wise. The method is based on a definition of causality (Pearl, 2000): if a high expression level of gene 1 is associated with a strong functional link between gene 1 and gene 2, we can infer that gene 1 has a certain causal effect on gene 2. This influence can be established by windowing the data. Confidence intervals are then computed through permutation testing.

Results: The results demonstrate that, as opposed to functional connectivity, directed connectivity clearly differentiates the gene co-expression network associated with OCD from an exemplary co-expression network unrelated to OCD.
Although inhibition in the network is possible in principle (as an influence with a value lower than zero), none was found. The highest significance was found for the following connections: MED9 -> SLC25A3, MED9 -> RALB, CTSZ -> GAD1, RGS7 -> DOK1, DCLK1 -> C6ORF57. These results are interesting because MED9 takes part in the synthesis of new mRNA, so an increase in its expression can indeed influence the transcription of new mRNA. Also, most connections appear relatively symmetrical, although symmetry is not imposed by the method. However, one particular gene, calcium/calmodulin-dependent protein kinase type II subunit beta (CAMK2B), has a substantially higher influence on most of the other genes in the network than the feedback influence of these genes on CAMK2B. This suggests that CAMK2B might be a key component of the gene-expression network underlying OCD.

Discussion: Classic interventional studies can be problematic in causal research on gene co-expression networks, as dozens to thousands of genes often underlie one phenotype. In this work, we propose a framework that addresses this issue and quantifies causal interactions without the need for any experimental manipulation. This can help in understanding the hierarchy in gene co-expression networks and, possibly, in localizing the genes that lie at the beginning of a causal chain leading to a particular phenotype. The method proposed in this study is novel, model-free and nonparametric, and the preliminary results of its application to the gene co-expression network of OCD in the DLPFC are very promising. We therefore believe that it should be further tested and validated in larger cohorts.
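
The windowing and permutation steps named in the Methods are not specified in detail in this abstract, so the following is only a minimal sketch of one way such an analysis could look, not the authors' implementation. The function names (`windowed_influence`, `permutation_test`), the window size and step, the use of Pearson correlation as the "functional link", and the choice of null (shuffling gene 2 across subjects) are all assumptions made for illustration. The data are synthetic.

```python
import numpy as np


def windowed_influence(x, y, window=30, step=5):
    """Illustrative influence score of gene x on gene y.

    Assumption, not the published estimator: subjects are sorted by the
    expression of x and grouped into overlapping windows. In each window
    we record the mean level of x and the Pearson correlation between x
    and y (the "functional link"). The score is the correlation between
    those two series, so it is positive when the link strengthens as the
    expression of x increases.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    levels, links = [], []
    for start in range(0, len(xs) - window + 1, step):
        wx = xs[start:start + window]
        wy = ys[start:start + window]
        levels.append(wx.mean())
        links.append(np.corrcoef(wx, wy)[0, 1])
    return float(np.corrcoef(levels, links)[0, 1])


def permutation_test(x, y, n_perm=200, alpha=0.05, seed=0, **kwargs):
    """Compare the observed score with scores from shuffled data.

    Assumed null: y is permuted across subjects, which removes any
    relation between the two genes. Returns the observed score, a
    percentile interval of the null scores, and a one-sided p-value.
    """
    rng = np.random.default_rng(seed)
    observed = windowed_influence(x, y, **kwargs)
    null = np.array([
        windowed_influence(x, rng.permutation(y), **kwargs)
        for _ in range(n_perm)
    ])
    lo, hi = np.quantile(null, [alpha / 2, 1 - alpha / 2])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, (float(lo), float(hi)), float(p_value)


# Synthetic data: gene2 follows gene1 only when gene1 is highly expressed.
rng = np.random.default_rng(1)
n_subjects = 102  # same size as the control cohort in the abstract
gene1 = rng.normal(size=n_subjects)
noise = rng.normal(size=n_subjects)
gene2 = np.where(gene1 > 0, 5.0 * gene1, 0.0) + 0.1 * noise

score, null_interval, p = permutation_test(gene1, gene2)
print("score:", score, "null interval:", null_interval, "p:", p)
```

Because the score is computed separately for each ordered pair, calling `windowed_influence(gene2, gene1)` gives the reverse direction, which is how an asymmetry such as the one reported for CAMK2B could be examined under these assumptions.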
