Abstract

Detecting allelic asymmetry in gene expression from RNA-seq data or in transcription factor binding from ChIP-seq data is one of the approaches currently used to identify functional genetic variants that can affect gene expression (regulatory SNPs, or rSNPs). In this study, we searched for rSNPs using data for human pulmonary arterial endothelial cells (PAECs) available from the Sequence Read Archive (SRA). Allele-asymmetric binding and expression events were analyzed in paired ChIP-seq data for the H3K4me3 mark and RNA-seq data obtained for 19 individuals. Two statistical approaches, weighted z-scores and predicted probabilities, were used to improve the efficiency of finding rSNPs. In total, we identified 14,266 rSNPs associated with both allele-specific binding and expression. Among them, 645 rSNPs were associated with GWAS phenotypes, 4,746 rSNPs were reported as eQTLs by GTEx, and 11,536 rSNPs were located in 374 candidate transcription factor binding motifs. Additionally, we searched for rSNPs associated with gene expression using an SRA RNA-seq dataset for 281 clinically annotated human postmortem brain samples and detected eQTLs for 2,505 rSNPs. Based on these results, we conducted Gene Ontology (GO), Disease Ontology (DO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses and constructed protein–protein interaction networks to represent the top-ranked biological processes with a possible contribution to the phenotypic outcome.

Highlights

  • Single nucleotide polymorphisms (SNPs) are the most common type of sequence variation

  • This is especially important in the case of noncoding SNPs, which account for about 90% of the GWAS-associated genetic variants [7,8] and the functional interpretation of which is the most complex task

  • We focused on the genomic regions containing two or more overlapping transcription factor (TF) binding regions (OTFRs) [37]

Introduction

Single nucleotide polymorphisms (SNPs) are the most common type of sequence variation. GWAS technology cannot provide information about the molecular mechanisms that determine how these variants affect disease risk, which makes laborious follow-up studies necessary for each selected variant [4,5,6]. This is especially important for noncoding SNPs, which account for about 90% of GWAS-associated genetic variants [7,8] and whose functional interpretation is the most complex task. Functional interpretation is necessary both to increase the prognostic value of polymorphisms and to make it possible to design new methods for correcting the associated clinical outcomes.
