Abstract

Expression quantitative trait loci (eQTL) analysis is useful for identifying genetic variants correlated with gene expression, however, it cannot distinguish between causal and nearby non-functional variants. Because the majority of disease-associated SNPs are located in regulatory regions, they can impact allele-specific binding (ASB) of transcription factors and result in differential expression of the target gene alleles. In this study, our aim was to identify functional single-nucleotide polymorphisms (SNPs) that alter transcriptional regulation and thus, potentially impact cellular function. Here, we present regSNPs-ASB, a generalized linear model-based approach to identify regulatory SNPs that are located in transcription factor binding sites. The input for this model includes ATAC-seq (assay for transposase-accessible chromatin with high-throughput sequencing) raw read counts from heterozygous loci, where differential transposase-cleavage patterns between two alleles indicate preferential transcription factor binding to one of the alleles. Using regSNPs-ASB, we identified 53 regulatory SNPs in human MCF-7 breast cancer cells and 125 regulatory SNPs in human mesenchymal stem cells (MSC). By integrating the regSNPs-ASB output with RNA-seq experimental data and publicly available chromatin interaction data from MCF-7 cells, we found that these 53 regulatory SNPs were associated with 74 potential target genes and that 32 (43%) of these genes showed significant allele-specific expression. By comparing all of the MCF-7 and MSC regulatory SNPs to the eQTLs in the Genome-Tissue Expression (GTEx) Project database, we found that 30% (16/53) of the regulatory SNPs in MCF-7 and 43% (52/122) of the regulatory SNPs in MSC were also in eQTL regions. The enrichment of regulatory SNPs in eQTLs indicated that many of them are likely responsible for allelic differences in gene expression (chi-square test, p-value < 0.01). In summary, we conclude that regSNPs-ASB is a useful tool for identifying causal variants from ATAC-seq data. This new computational tool will enable efficient prioritization of genetic variants identified as eQTL for further studies to validate their causal regulatory function. Ultimately, identifying causal genetic variants will further our understanding of the underlying molecular mechanisms of disease and the eventual development of potential therapeutic targets.

Highlights

  • Expression quantitative trait loci analysis has developed over the years into a powerful tool to investigate the effects of genetic variants on gene regulatory networks, identify quantitative traits for complex diseases, and derive causal inference frameworks for genomic markers and gene expression (Fagny et al, 2017; van der Wijst et al, 2020)

  • We found that 30.2% (16/53) of the regulatory single-nucleotide polymorphisms (SNPs) in MCF-7 and 42.6% (52/122) of the regulatory SNPs in mesenchymal stem cells (MSC) were Expression quantitative trait loci (eQTL) (Supplementary Tables S3, S4)

  • EQTLs and genome-wide association study (GWAS) are conventional methods for mapping association variants that have identified tens of thousands of loci that are highly correlated with gene expression and common diseases, respectively

Read more

Summary

Introduction

Expression quantitative trait loci (eQTL) analysis has developed over the years into a powerful tool to investigate the effects of genetic variants on gene regulatory networks, identify quantitative traits for complex diseases, and derive causal inference frameworks for genomic markers and gene expression (Fagny et al, 2017; van der Wijst et al, 2020). EQTL analysis is designed to investigate how single-nucleotide polymorphisms (SNPs) in regulatory elements directly modify the abundance of a gene transcript. The major goal of an eQTL study is to reduce the large number of variants identified from a genome-wide association study (GWAS) to a list of potential causal SNPs for further investigation into how the locus contributes to disease. We aim to identify functional SNPs in key regulatory regions that alter transcriptional regulation and potentially impact cellular function. Such variants would be important for investigating the etiology of the associated disease and for identifying potential therapeutic targets

Objectives
Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call