Abstract

Long non-coding RNAs (lncRNAs) play crucial roles in human physiology, and have been found to be associated with various cancers. Transcribed ultraconserved regions (T-UCRs) are a subgroup of lncRNAs conserved in several species, and are often located in cancer-related regions. Breast cancer is the most common cancer in women worldwide and the leading cause of female cancer deaths. We investigated the association of genetic variants in lncRNA and T-UCR regions with breast cancer risk to uncover candidate loci for further analysis. Our focus was on low-penetrance variants that can be discovered in a large dataset. We selected 565 regions of lncRNAs and T-UCRs that are expressed in breast or breast cancer tissue, or show expression correlation to major breast cancer associated genes. We studied the association of single nucleotide polymorphisms (SNPs) in these regions with breast cancer risk in the 122970 case samples and 105974 controls of the Breast Cancer Association Consortium’s genome-wide data, and also by in silico functional analyses using Integrated Expression Quantitative trait and in silico prediction of GWAS targets (INQUISIT) and expression quantitative trait loci (eQTL) analysis. The eQTL analysis was carried out using the METABRIC dataset and analyses from GTEx and ncRNA eQTL databases. We found putative breast cancer risk variants (p < 1 × 10⁻⁵) targeting the lncRNA GABPB1-AS1 in INQUISIT and eQTL analysis. In addition, putative breast cancer risk associated SNPs (p < 1 × 10⁻⁵) in the region of two T-UCRs, uc.184 and uc.313, located in protein coding genes CPEB4 and TIAL1, respectively, targeted these genes in INQUISIT and in eQTL analysis. Other non-coding regions containing SNPs with the defined p-value and highly significant false discovery rate (FDR) for breast cancer risk association were discovered that may warrant further studies. These results suggest candidate lncRNA loci for further research on breast cancer risk and the molecular mechanisms.

Highlights

  • About 70–90% of the human genome is transcribed (Guttman et al, 2009; Mercer et al, 2009)

  • We investigated the breast cancer risk association of single nucleotide polymorphisms (SNPs) in long non-coding RNAs (lncRNAs) expressed in mammary tissue or associated with known breast cancer risk genes, as well as SNPs located in transcribed ultraconserved regions (T-UCRs)

  • We looked into the breast cancer risk association of SNPs in the regions of breast cancer-relevant lncRNAs and of T-UCRs around the genome in a large cohort of European breast cancer patients


Summary

Introduction

About 70–90% of the human genome is transcribed (Guttman et al, 2009; Mercer et al, 2009). These transcripts include long non-coding RNAs (lncRNAs), defined as ncRNAs with over 200 nucleotides. They participate in various biological processes, including differentiation, immune response and metabolism (Kretz et al, 2013; Hung et al, 2014; Wang et al, 2014), as well as in pathogenic processes, such as the development and progression of cancer (Gupta et al, 2010; Yang et al, 2013; Xing et al, 2014). The study of T-UCR expression is complicated: based on annotation compiled by Mestdagh et al (2010), 38.7% of the 481 T-UCRs are intergenic, 57.4% are located in protein coding genes (42.6% intronic, 4.2% exonic, 5% partly exonic, and 5.6% exon containing), and 3.9% lack an explicit gene-related annotation because of host gene splice variants. Many of the T-UCRs are located in cancer-related regions and fragile sites, and their expression is frequently altered in human cancer (Amos et al, 2017; Fabris and Calin, 2017; Terracciano et al, 2017).
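The annotation breakdown quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: the percentages are taken from the text, while the variable names and the conversion back to approximate whole-number counts are assumptions made purely for illustration.

```python
# Percentages of the 481 T-UCRs per annotation class, as quoted in the text
# (annotation attributed to Mestdagh et al, 2010).
TOTAL_TUCRS = 481
categories = {
    "intergenic": 38.7,
    "intronic": 42.6,
    "exonic": 4.2,
    "partly exonic": 5.0,
    "exon containing": 5.6,
    "no explicit annotation": 3.9,
}

# The four classes that the text groups as "located in protein coding genes".
GENIC = ("intronic", "exonic", "partly exonic", "exon containing")

# Round to one decimal to avoid floating-point noise in the sums.
genic_pct = round(sum(categories[name] for name in GENIC), 1)
total_pct = round(sum(categories.values()), 1)

# ASSUMPTION: the text reports only percentages; rounding them back to
# whole T-UCR counts is an approximation, not a figure from the source.
approx_counts = {
    name: round(TOTAL_TUCRS * pct / 100) for name, pct in categories.items()
}

print(f"genic share: {genic_pct}%  all classes: {total_pct}%")
for name, count in approx_counts.items():
    print(f"{name:>24}: ~{count}")
```

The four genic subclasses add up to the stated 57.4%, and all classes together cover 100% of the T-UCRs, so the quoted breakdown is internally consistent.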

