Abstract

There is a critical need to identify the factors driving aggressive breast cancer (BCa), which is more likely to develop in African American (AA) women, who have a lower incidence of BCa but higher mortality from all invasive BCa subtypes than Caucasian (CA) women. The etiology of these disparities remains unclear, although experimental evidence has shown a statistically significant biological difference stemming from an immunobiological gene signature in prostate, breast, or colorectal tumor tissues. It is currently unknown whether this signature can influence tumor initiation, progression, and/or therapeutic intervention. Furthermore, a mechanistic understanding of the gene expression network signature measured between AA and CA BCa tumors is largely lacking. Using existing data from the NIH's Cancer Genome Atlas (TCGA), we correlated gene expression between AA and CA tumors with 63 genes containing ancestry-informative missense mutations. The genes were selected from single nucleotide polymorphisms (SNPs) using the Affymetrix SNP 6 array. Briefly, selecting SNPs with a minor allele frequency (MAF) greater than 0.3 in the HapMap Yoruba population from Ibadan, Nigeria (YRI) and a MAF less than 0.05 in the Caucasian (CEU), Chinese (CHB), and Japanese (JPT) populations yielded a list of 17,995 candidate SNPs. SNPs in non-coding regions were removed, leaving 7,782 gene-annotated SNPs. Finally, retaining only SNPs in exonic regions with protein-coding mutations (non-synonymous SNPs) led to the 63-gene list. To compare these genes with existing gene expression data, admixture analysis was performed on the TCGA genotypes from the Affymetrix Genome-Wide Human SNP Array 6.0 using normal tissues. An initial independent list of 4,323 cross-platform comparable SNPs was chosen for admixture analysis. Each SNP for each individual was filtered for a Birdseed confidence score <0.1.
Using PLINK v1.07, we excluded an additional 828 SNPs from further analysis because they had a MAF <0.05 or deviated from Hardy-Weinberg equilibrium (p < 0.00001). A total of 641 SNPs were used in the admixture analysis, which was performed with the Bayesian Markov chain Monte Carlo method implemented in the program Structure 2.3.4. Gene expression in tumors was compared between AA and CA patients by stage (I to III) or subtype (Luminal A, Luminal B, HER2 positive, or Triple Negative). Although the list of 63 candidate genes included additional genes related to an immunobiological signature (e.g., CD86 molecule and chemokine C-X-C motif receptor 6), a statistically significant correlation with the filtered SNP mutation list was found for only one gene. This gene was the interleukin 6 signal transducer (IL6ST), with a SNP mutation in rs2228046 resulting in a change from isoleucine to threonine. The functional consequence of this change in breast cancer is unknown. This approach may uncover additional tumor gene expression differences that result from ancestral variation in coding sequences. The functional consequences of these variations in normal and tumor cells may potentiate tumor initiation, progression, and response to therapy. Sequence-specific variant analysis between CA and AA tumors, and its impact on additional gene and protein expression, is currently under investigation using the TCGA data.

Citation Format: John Tyson McDonald, Luisel Ricks-Santi. Ancestrally derived SNPs influence on gene expression in breast cancer data sets. [abstract]. In: Proceedings of the AACR Special Conference on Computational and Systems Biology of Cancer; Feb 8-11 2015; San Francisco, CA. Philadelphia (PA): AACR; Cancer Res 2015;75(22 Suppl 2):Abstract nr B2-14.
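
The ancestry-informative SNP selection described in the abstract (MAF greater than 0.3 in YRI and less than 0.05 in each of CEU, CHB, and JPT) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `SnpFrequencies` record, the function names, and the example SNP identifiers and frequencies are all hypothetical, and only the thresholds and population codes come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping, Sequence


@dataclass(frozen=True)
class SnpFrequencies:
    """Hypothetical record: one SNP with its MAF per HapMap population code."""
    snp_id: str
    maf: Mapping[str, float]


def is_ancestry_informative(
    snp: SnpFrequencies,
    enriched_pop: str = "YRI",
    other_pops: Sequence[str] = ("CEU", "CHB", "JPT"),
    min_enriched_maf: float = 0.3,   # threshold stated in the abstract
    max_other_maf: float = 0.05,     # threshold stated in the abstract
) -> bool:
    """Return True if the SNP is common in YRI and rare in every other population."""
    if snp.maf[enriched_pop] <= min_enriched_maf:
        return False
    return all(snp.maf[pop] < max_other_maf for pop in other_pops)


def select_candidates(snps: Iterable[SnpFrequencies]) -> List[SnpFrequencies]:
    """Keep only SNPs that pass the ancestry-informative MAF criteria."""
    return [snp for snp in snps if is_ancestry_informative(snp)]


# Made-up frequencies, for illustration only (not real HapMap values).
example_snps = [
    SnpFrequencies("snp_a", {"YRI": 0.45, "CEU": 0.01, "CHB": 0.00, "JPT": 0.02}),
    SnpFrequencies("snp_b", {"YRI": 0.45, "CEU": 0.20, "CHB": 0.00, "JPT": 0.02}),
    SnpFrequencies("snp_c", {"YRI": 0.10, "CEU": 0.01, "CHB": 0.00, "JPT": 0.02}),
]

selected_ids = [snp.snp_id for snp in select_candidates(example_snps)]
print(selected_ids)
```

In this toy input only `snp_a` passes: `snp_b` is too common in CEU and `snp_c` is too rare in YRI. The later steps in the abstract (removing non-coding SNPs and keeping non-synonymous exonic SNPs) would require gene annotation data and are not sketched here.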
