Abstract

Identifying noncoding risk variants remains a challenging task. Because noncoding variants exert their effects in the context of a gene regulatory network (GRN), we hypothesize that explicit use of disease-relevant GRNs can significantly improve the inference accuracy of noncoding risk variants. We describe Annotation of Regulatory Variants using Integrated Networks (ARVIN), a general computational framework for predicting causal noncoding variants. It combines a set of novel regulatory network-based features with sequence-based features to infer noncoding risk variants. Using known causal variants in gene promoters and enhancers in a number of diseases, we show that ARVIN outperforms state-of-the-art methods that use sequence-based features alone. Additional experimental validation using reporter assays further demonstrates the accuracy of ARVIN. Application of ARVIN to seven autoimmune diseases provides a holistic view of the gene subnetwork perturbed by the combinatorial action of the entire set of noncoding risk mutations.

Highlights

  • Identifying noncoding risk variants remains a challenging task

  • We postulate that (1) the impact of causal enhancer SNPs (eSNPs) on gene expression is transmitted through the gene regulatory network (GRN) in the cell/tissue types that are relevant to the studied trait; and (2) the genes affected by the full set of causal eSNPs for a trait are organized in a limited number of pathways

  • Using gold-standard noncoding variants, we demonstrate that genes targeted by causal single nucleotide polymorphisms (SNPs) exhibit characteristic network features compared to genes targeted by noncausal SNPs

Introduction

Identifying noncoding risk variants remains a challenging task. Existing methods operate by annotating genetic variants using a catalog of cis-regulatory sequences (based on chromatin accessibility, transcription factor binding, and epigenetic modification signatures). Although biologically intuitive, such an approach does not take into account the complex interactions of the underlying gene regulatory network (GRN) in which a causal noncoding variant exerts its effect, namely, interactions among transcription factors and their target genes as well as interactions among target genes in the same pathway. Because noncoding variants exert their effects in the context of a GRN, we hypothesize that explicit use of disease-relevant GRNs can significantly improve the inference accuracy of noncoding risk variants. Specifically, we postulate that (1) the impact of causal enhancer SNPs (eSNPs) on gene expression is transmitted through the GRNs in the cell/tissue types that are relevant to the studied trait; and (2) the genes affected by the full set of causal eSNPs for a trait are organized in a limited number of pathways. We test this hypothesis by developing a general computational framework for identifying causal noncoding variants that affect a specific disease/trait. By applying our method to seven autoimmune diseases, we obtain a systems view of the entire set of risk eSNPs in a given disease and, importantly, of the subnetwork that is perturbed by that set.

