Abstract

Breast cancer (BCa) genome-wide association studies have revealed allelic frequency differences between cases and controls at index single nucleotide polymorphisms (SNPs). To date, 71 loci have been identified and replicated in this way. Because of linkage disequilibrium (LD), more than 320,000 SNPs at these loci could define BCa risk. We propose that BCa risk resides in a subgroup of SNPs that functionally affects breast biology. A shortlist of such SNPs will aid in framing hypotheses to prioritize a manageable number of likely disease-causing SNPs. From the 1000 Genomes Project, we extracted all SNPs residing in 1 Mb windows around each BCa risk index SNP to find correlated SNPs. We used FunciSNP, an R/Bioconductor package developed in-house, to identify potentially functional SNPs at the 71 risk loci by intersecting them with chromatin biofeatures. We identified 1,005 SNPs in LD with the index SNPs (r² ≥ 0.5) in three categories: 21 in exons of 18 genes, 76 in transcription start site (TSS) regions of 25 genes, and 921 in enhancers. Thirteen SNPs were found in more than one category. We found two correlated and predicted non-benign coding variants (rs8100241 in exon 2 and rs8108174 in exon 3) of the gene ANKLE1. Most putative functional LD SNPs, however, were found either in epigenetically defined enhancers or in gene TSS regions. Fifty-five percent of these non-coding SNPs are likely functional, since they affect response element (RE) sequences of transcription factors. Functionality of these SNPs was assessed by expression quantitative trait loci (eQTL) analysis and allele-specific enhancer assays. Unbiased analyses of SNPs at BCa risk loci revealed new and overlooked mechanisms that may affect risk of the disease, thereby providing a valuable resource for follow-up studies.
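The annotation step described in the abstract (intersecting correlated SNPs with chromatin biofeatures and counting them per category) can be illustrated with a minimal Python sketch. This is not the FunciSNP implementation, which is an R/Bioconductor package; the function names `annotate_snps` and `summarize`, the half-open interval convention, and all SNP identifiers and coordinates below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def annotate_snps(snps, biofeatures):
    """Map each SNP id to the set of biofeature categories it overlaps.

    snps: dict of snp_id -> (chrom, pos)
    biofeatures: dict of category -> list of (chrom, start, end)
                 Intervals are treated as half-open [start, end); this
                 convention is an assumption of the sketch.
    """
    hits = defaultdict(set)
    for category, intervals in biofeatures.items():
        for snp_id, (chrom, pos) in snps.items():
            if any(c == chrom and start <= pos < end for c, start, end in intervals):
                hits[snp_id].add(category)
    return dict(hits)


def summarize(hits):
    """Return (per-category counts, unique SNP count, SNPs in more than one category)."""
    per_category = defaultdict(int)
    for categories in hits.values():
        for category in categories:
            per_category[category] += 1
    multi = sum(1 for categories in hits.values() if len(categories) > 1)
    return dict(per_category), len(hits), multi


# Toy data (hypothetical): four SNPs and three biofeature categories.
snps = {
    "snpA": ("chr1", 150),
    "snpB": ("chr1", 950),
    "snpC": ("chr2", 50),
    "snpD": ("chr1", 5000),  # overlaps no biofeature
}
biofeatures = {
    "exon": [("chr1", 100, 200)],
    "tss": [("chr1", 120, 180), ("chr2", 0, 100)],
    "enhancer": [("chr1", 900, 1000)],
}

hits = annotate_snps(snps, biofeatures)
per_category, unique_total, multi = summarize(hits)
print(per_category, unique_total, multi)
```

The same bookkeeping explains the totals in the abstract: 21 + 76 + 921 gives 1,018 category assignments, and subtracting the 13 SNPs found in more than one category gives 1,005 unique SNPs. That arithmetic holds if each of the 13 falls in exactly two categories, which is an inference from the numbers rather than something the abstract states.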

Highlights

  • Apart from a few examples of genetic mutations with high penetrance, such as those found in the BRCA1 and BRCA2 genes [1], most genetic risk of breast cancer (BCa) resides at multiple low-penetrance loci, more recently identified by genome-wide association studies (GWASs) [2]

  • Although some index single nucleotide polymorphisms (SNPs), such as rs11571833 (Lys3326Term in the BRCA2 gene) [15], seem to be involved in known genetic mechanisms of breast cancer tumorigenesis [1], the mechanisms for most of the other index SNPs are hidden. These index SNPs are most likely surrogates of many other SNPs in linkage disequilibrium (LD), since most of the GWAS arrays were designed based on the HapMap data to capture a large fraction of common genetic variation [27]

  • When we extracted SNP data for Europeans from the 1000 Genomes Project released in May 2012 [28], we found 308,010 very low LD (0 ≤ r² < 0.1), 11,438 low LD (0.1 ≤ r² < 0.5), and 3,508 high LD (r² ≥ 0.5) SNPs at the 71 BCa risk loci (Fig. 1B)
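The LD categories in the last highlight rest on the pairwise r² statistic, with cut-offs at 0.1 and 0.5. As a minimal sketch, assuming phased haplotypes coded 0/1 for two biallelic SNPs, r² can be computed as D² / (p_A (1 − p_A) p_B (1 − p_B)) with D = p_AB − p_A p_B. The function names `r_squared` and `ld_bin`, the handling of monomorphic sites, and the inclusive lower bounds of each bin are assumptions of this example and are not taken from the study's pipeline.

```python
def r_squared(hap_a, hap_b):
    """LD r^2 between two biallelic SNPs from phased haplotypes coded 0/1."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                                   # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                                   # frequency of allele 1 at SNP B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n    # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # r^2 is undefined for a monomorphic site; returning 0.0 is a choice
        # made for this sketch only.
        return 0.0
    d = p_ab - p_a * p_b
    return d * d / denom


def ld_bin(r2):
    """Assign an r^2 value to one of the three LD categories (cut-offs 0.1 and 0.5)."""
    if r2 >= 0.5:
        return "high"
    if r2 >= 0.1:
        return "low"
    return "very low"


# Toy haplotypes (hypothetical): one perfectly correlated and one uncorrelated pair.
index_snp = [0, 0, 1, 1]
linked_snp = [0, 0, 1, 1]
unlinked_snp = [0, 1, 0, 1]

print(ld_bin(r_squared(index_snp, linked_snp)))
print(ld_bin(r_squared(index_snp, unlinked_snp)))
```

In practice, r² against each index SNP would be computed for every SNP in the surrounding window before binning; the toy haplotypes here only show the two extremes.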

Introduction

Apart from a few examples of genetic mutations with high penetrance, such as those found in the BRCA1 and BRCA2 genes [1], most genetic risk of breast cancer (BCa) resides at multiple low-penetrance loci, more recently identified by genome-wide association studies (GWASs) [2]. GWASs use single nucleotide polymorphisms (SNPs) to tag common genetic variation in linkage disequilibrium (LD) blocks in order to identify genome-wide risk loci for complex diseases. Due to this plethora of SNPs in LD, much of the heritability of complex diseases, such as BCa, remains unknown [17]. Identifying the underlying mechanisms by which SNPs affect risk, as described in this study, will provide a better understanding of the genetic risk of complex diseases such as breast cancer.
