Abstract
BackgroundMost cancer risk-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) are noncoding and it is challenging to assess their functional impacts. To systematically identify the SNPs that affect gene expression by modulating activities of distal regulatory elements, we adapt the self-transcribing active regulatory region sequencing (STARR-seq) strategy, a high-throughput technique to functionally quantify enhancer activities.ResultsFrom 10,673 SNPs linked with 996 cancer risk-associated SNPs identified in previous GWAS studies, we identify 575 SNPs in the fragments that positively regulate gene expression, and 758 SNPs in the fragments with negative regulatory activities. Among them, 70 variants are regulatory variants for which the two alleles confer different regulatory activities. We analyze in depth two regulatory variants—breast cancer risk SNP rs11055880 and leukemia risk-associated SNP rs12142375—and demonstrate their endogenous regulatory activities on expression of ATF7IP and PDE4B genes, respectively, using a CRISPR-Cas9 approach.ConclusionsBy identifying regulatory variants associated with cancer susceptibility and studying their molecular functions, we hope to help the interpretation of GWAS results and provide improved information for cancer risk assessment.
Highlights
Most cancer risk-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) are noncoding and it is challenging to assess their functional impacts
A modified STARR-seq strategy to detect regulatory variants associated with cancer susceptibility To detect regulatory variants associated with cancer risk, we focused on the 996 GWAS hits for cancer susceptibility and drug response catalogued in NHGRI up to 2013 [1]
As causal SNPs could be in linkage disequilibrium (LD) with a SNP reported in the GWAS catalogue [7], we included 10,673 SNPs that were in high LD (r2 > 0.8) with the 996 reported SNPs (Additional file 1: Figure S1a)
Summary
Most cancer risk-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) are noncoding and it is challenging to assess their functional impacts. Annotating cancer susceptibility SNPs using chromatin states, sequence motifs, and eQTL sites can help prioritize variants for further assessment on their functional consequences [14, 15] To validate these predictions on a large scale, high-throughput experimental approaches to directly quantify their regulatory effects are urgently needed. At an even larger scale, the self-transcribing active regulatory region sequencing (STARR-seq) approach allows for directly measuring the activities of millions of enhancers by using testing sequences as their own reporters, taking advantage of the position-independent property of enhancers [18, 19] These methods have the potential to be adopted for direct testing of regulatory SNPs. Recently, two groups have reported direct identification of expression-modulating variants associated with GWAS traits using modified MPRAs [20, 21]. They synthesized tens of thousands of DNA elements containing both alleles of each SNP to recapture the variants in a population to test by MPRA, with increased numbers of barcodes for each variant to improve the sensitivity and reproducibility [20, 21]
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.