Genome-wide Association Studies Hits Research Articles

Abstract
Background: Though several genome-wide association studies (GWAS) of breast cancer (BC) have identified common variants which differ between intrinsic subtypes, the genes through which these variants act to affect BC risk have not been fully established. Furthermore, transcriptome-wide association studies (TWAS) have thus far been primarily employed to identify genes associated with overall BC risk, while overlooking how the influence of common variation on gene expression may contribute to subtype-specific differences.
Methods: In this study, we performed two complementary multi-tissue TWASs for each of the following BC intrinsic subtypes: Luminal A-like, Luminal B-like, Luminal B/HER2-negative-like, HER2-enriched-like, and triple-negative BC (TNBC). These two approaches included 1) an expression-based approach that collated TWAS signals from expression quantitative trait loci (eQTLs) across multiple tissues using the Aggregated Cauchy Association Test (ACAT) and 2) a splicing-based approach that utilized two applications of ACAT to collate splicing QTLs (sQTLs) for a given gene in a tissue and then across tissues. To perform these two TWASs, we utilized e/sQTL models trained in 11 tissues from the Genotype-Tissue Expression Project including breast, ovary, uterus, vagina, EBV-transformed lymphocytes, whole blood, spleen, liver, subcutaneous adipose, visceral adipose, and cell-cultured fibroblasts. GWAS summary statistics were previously generated from 133,384 BC cases and 113,789 controls who were participants in the Breast Cancer Association Consortium (BCAC). We further performed our TWAS while conditioning e/sQTL effect sizes on nearby GWAS index SNPs. Additionally, we utilized gene-based fine-mapping of eQTLs and sQTLs to identify candidate causal genes for each intrinsic subtype.
Results: Overall, we identified 164 genes in 69 loci that were associated with Luminal A-like, 19 genes in 9 loci with Luminal B-like, 18 genes in 11 loci with Luminal B/HER2-negative-like, 10 genes in 7 loci with HER2-enriched-like, and 29 genes in 12 loci with TNBC. Among these genes, 17 genes had not been reported in previous TWAS of BC, and 140 genes, 1 gene, 2 genes, 2 genes, and 16 genes were uniquely associated with each of the intrinsic subtypes, respectively. Additionally, we identified one gene associated with Luminal A-like and one gene with Luminal B-like BC that were each in a locus located at least 1.4 Mb from published GWAS hits. Furthermore, we identified 106, 11, 10, 5, and 21 candidate causal genes for each of the intrinsic subtypes, respectively, that had a posterior inclusion probability of 0.9 in at least one e/sQTL.
Conclusion: In summary, our multi-tissue TWAS corroborated previous GWAS loci for overall BC risk and intrinsic subtypes, while underscoring how common variation impacts BC etiology by modulating the expression and splicing of genes in multiple tissue types.
Citation Format: James Li, Julian McClellan, Haoyu Zhang, Guimin Gao, Dezheng Huo. Multi-tissue transcriptome-wide association studies identified genes for intrinsic subtypes of breast cancer [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO5-09-03.
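The Methods above describe using ACAT once across tissues for expression, and twice for splicing (within a tissue, then across tissues). The following is a minimal sketch of that combination step, not the authors' pipeline. It uses the standard Cauchy combination rule, in which each p-value is mapped to tan((0.5 − p)π), the weighted average is taken, and the result is mapped back to a p-value. Equal weights, the function names, and the toy p-values are all assumptions made for illustration; the abstract does not state how weights were chosen.

```python
import math


def acat(pvalues, weights=None):
    """Combine p-values with the Cauchy combination rule (ACAT-style)."""
    pvalues = list(pvalues)
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(pvalues)  # assumption: equal weights
    total_weight = sum(weights)

    stat = 0.0
    for p, w in zip(pvalues, weights):
        if p < 1e-16:
            # tan((0.5 - p) * pi) is approximately 1 / (p * pi) for tiny p;
            # this avoids floating-point round-off near pi / 2.
            stat += w / (p * math.pi)
        else:
            stat += w * math.tan((0.5 - p) * math.pi)
    stat /= total_weight

    if stat > 1e15:
        # Same tail approximation, applied on the way back to a p-value.
        return 1.0 / (stat * math.pi)
    return 0.5 - math.atan(stat) / math.pi


def expression_gene_pvalue(p_by_tissue):
    """One ACAT pass: combine a gene's eQTL-based TWAS p-values across tissues."""
    return acat(p_by_tissue.values())


def splicing_gene_pvalue(sqtl_p_by_tissue):
    """Two ACAT passes: within each tissue over splicing events, then across tissues."""
    per_tissue = [acat(ps) for ps in sqtl_p_by_tissue.values()]
    return acat(per_tissue)


# Made-up p-values for a single hypothetical gene.
toy_expression = {"breast": 2e-6, "whole_blood": 0.04, "liver": 0.6}
toy_splicing = {"breast": [0.3, 1e-5], "spleen": [0.5], "ovary": [0.02, 0.8]}

gene_expr_p = expression_gene_pvalue(toy_expression)
gene_splice_p = splicing_gene_pvalue(toy_splicing)
```

A practical property of this rule is that the combined p-value is driven mainly by the smallest inputs, so a strong signal in one tissue is not diluted much by null results in the others.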

Maize is known for its phenotypic and genetic diversity. On average, two maize lines diverge from one another as much as humans do from chimpanzees (Buckler et al., 2006). This diversity is attributed to pollen flow between domesticated maize and its wild relative teosinte, as well as to trading between farmers (Hake & Ross-Ibarra, 2015). Maize diversity contributes to its adaptability to new climates and growth conditions so that, currently, maize is grown across a wider area than any other crop (Hake & Ross-Ibarra, 2015). Analysing maize populations based on DNA sequence polymorphisms (markers) is the basis for identifying selection targets and understanding geographic relations. Many studies (Brandenburg et al., 2017; Chia et al., 2012) have revealed changes in genetic diversity by re-sequencing maize lines with differing sequencing depth, and with samples from across America and Europe. However, comparing markers between datasets of different studies can be challenging: datasets can differ in allele frequencies, in the single nucleotide polymorphism (SNP)-calling pipelines, or in stochastic distributions of read depths, and therefore might identify different markers even in the same genomic region. To address this, Marcin Grzybowski, James Schnable and team set out to unify previous datasets as well as incorporate newly re-sequenced samples in one dataset (Grzybowski et al., 2023).
Schnable obtained his PhD from the University of California–Berkeley, working on plant comparative genomics. During his lab rotations, he got hooked on computational biology because this approach leads to much faster results than wet-lab experiments. After a short postdoc at the Chinese Academy of Agricultural Sciences in Beijing, he was hired as a Professor at the University of Nebraska–Lincoln to work on computational biology. He soon got involved in a new plant phenotyping platform being started there, and brought together teams of biologists, computer scientists and statisticians to work with the platform. “It's been a lot of fun”, he says.
For the diversity dataset, Grzybowski and colleagues used whole genome resequencing data from 1515 maize samples, comprising lines from the Wisconsin Diversity panel (Hansey et al., 2011), inbred lines from Poland, as well as wild relatives, tropical landraces, archaeological samples and modern open-pollinated varieties. Overall, these samples originated in or were developed over six continents (Figure 1a). The sequence data were aligned to the maize reference genome, and over 350 million potential DNA sequence polymorphisms were identified. The dataset is therefore much larger than the approximately 83 million variants identified in the maize HapMap3 project, which included over 1200 maize accessions, but used a lower sequencing depth (Bukowski et al., 2018). Second-stage quality filtering of their new dataset reduced the number of variants to approximately 46 million higher confidence variants.
Figure 1. Marker dataset for global maize diversity can be used to identify new genes associated with a trait. (a) Geographical distribution of the countries of origin for the 1515 maize individuals used in the study. (b) Association test using the marker set defined in this study identified MADS69 and ZCN8 based on female flowering data (days to silking). Adapted from Grzybowski et al. (2023).
Population genetic analyses are not only based on DNA sequence diversity, but also require the analysis of phenotypic traits. However, comparing phenotypes across different environments adds more variance and thereby reduces the statistical power to link genotype and phenotype. Comparing genotypes within the same environment is desirable, but not all genotypes are adapted to the same environment or can complete their life cycle there. To tackle this problem, researchers use association panels, which maximize genetic diversity among genotypes adapted to a specific environment. Grzybowski and colleagues used their marker set to analyse the Wisconsin Diversity Panel (Hansey et al., 2011) and found that the lines retained over 70% of the DNA sequence variation found in the wild relative Zea mays ssp. parviglumis, suggesting that there is still a lot of genetic variation in this diversity panel. Therefore, the marker dataset can serve as a resource for other researchers to calculate accurate values of DNA sequence diversity for their populations.
To analyse the impact of the high marker density on the outcomes of genome-wide association studies (GWAS), Grzybowski and colleagues used a published set of female flowering data generated from temperate-adapted maize inbreds (Mural et al., 2022). A previous GWAS using around 400 000 RNA-sequencing-based markers identified the flowering time gene MADS69 (Mural et al., 2022). With the newly generated marker set, Grzybowski and colleagues identified both MADS69 and a new locus, ZCN8, a gene that contributes to maize adaptation to temperate climates (Guo et al., 2018; Figure 1b). Grzybowski and colleagues speculate that ZCN8 was previously not discovered because the earlier marker set was RNA-sequencing-based and therefore missed significant SNPs in the intergenic space. Grzybowski and colleagues are hopeful that the higher density of markers will also help to achieve more precise localization of the causal variants associated with specific GWAS hits.
Because of the range of diversity among the maize lines used, this dataset can also be used to detect selection patterns in the genome associated with traits of interest; for example, those related to domestication, adaptation to the environment or genetic improvement during breeding. In the case of MADS69, Grzybowski and colleagues found less DNA sequence diversity in the promoter in tropical and temperate maize lines than in the wild relative. For ZCN8, they confirmed previous studies that had shown less DNA sequence diversity in domesticated maize than in teosinte (Guo et al., 2018). Additionally, they showed that the decline was biggest between temperate and tropical domesticated maize populations, whereas tropical maize diversity was similar to that of teosinte, suggesting that ZCN8 was selected during adaptation to temperate conditions rather than during domestication.
Grzybowski, who is currently a senior researcher at the University of Warsaw, hopes to use the dataset to identify genes underlying maize adaptation to cold and understand their evolutionary history, a topic that relates back to his PhD studies on maize adaptations to cold spring conditions. Including global maize diversity in the dataset, instead of focusing on a single GWAS population adapted to a single environment, will make it a lot easier for researchers to track the evolutionary histories and impact of selection on functional variants once they have been identified in a GWAS.
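The diversity comparisons described above (a panel retaining a fraction of the variation found in a wild relative, or a promoter showing reduced diversity in one population) can be illustrated with a small calculation. The sketch below uses per-site nucleotide diversity (π) estimated from alternate-allele counts at biallelic SNPs, which is one common measure of DNA sequence diversity. The text here does not state which statistic or software the authors used, so the choice of π, the function names, and all the numbers are assumptions made purely for illustration.

```python
def nucleotide_diversity(alt_allele_counts, n_chromosomes, n_sites):
    """Per-site nucleotide diversity (pi) from biallelic alt-allele counts.

    alt_allele_counts: alternate-allele count at each variable site
    n_chromosomes:     number of sampled chromosomes (same at every site)
    n_sites:           total sites in the region, variable or not
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    if n_sites < 1:
        raise ValueError("region must contain at least one site")
    # n / (n - 1) makes the per-site heterozygosity an unbiased estimate.
    correction = n_chromosomes / (n_chromosomes - 1)
    total = 0.0
    for count in alt_allele_counts:
        p = count / n_chromosomes
        total += 2.0 * p * (1.0 - p)
    return correction * total / n_sites


def retained_fraction(pi_derived, pi_reference):
    """Share of the reference population's diversity seen in the derived one."""
    if pi_reference <= 0.0:
        raise ValueError("reference diversity must be positive")
    return pi_derived / pi_reference


# Hypothetical 1 kb region scored in two made-up samples of 20 chromosomes each.
pi_wild = nucleotide_diversity([10, 6, 3, 8, 12], n_chromosomes=20, n_sites=1000)
pi_panel = nucleotide_diversity([9, 2, 0, 7, 12], n_chromosomes=20, n_sites=1000)
fraction = retained_fraction(pi_panel, pi_wild)
```

Computing the same ratio in windows along a chromosome, rather than for one region, is how a local drop in diversity around a gene would show up against the genome-wide background.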

Articles published on Genome-wide Association Studies Hits

Abstract PO5-09-03: Multi-tissue transcriptome-wide association studies identified genes for intrinsic subtypes of breast cancer

Genetic regulatory effects in response to a high-cholesterol, high-fat diet in baboons

Machine Learning Models of Polygenic Risk for Enhanced Prediction of Alzheimer Disease Endophenotypes.

Replication of previous autism-GWAS hits suggests the association between NAA1, SORCS3, and GSDME and autism in the Han Chinese population

Charting the genetic architecture of Alzheimer’s disease across APOE*4 and sex

Systematic differences in discovery of genetic effects on gene expression and complex traits.

Persistent homology analysis of type 2 diabetes genome-wide association studies in protein-protein interaction networks.

Alzheimer’s disease-associated complement gene variants influence plasma complement protein levels

Association of Polygenic Score With Tumor Molecular Subtypes in Papillary Thyroid Carcinoma.

Multiset correlation and factor analysis enables exploration of multi-omics data

A transcriptome association study of AD risk genes in the pathogenesis of Alzheimer’s disease

A joint transcriptome-wide association study across multiple tissues identifies candidate breast cancer susceptibility genes

What does heritability of Alzheimer's disease represent?

Design and quality control of large-scale two-sample Mendelian randomization studies.

Deletion mapping of regulatory elements for GATA3 in T cells reveals a distal enhancer involved in allergic diseases

Embracing diversity: a genetic marker dataset with increased marker density facilitates association studies in maize

Complement receptor 1 is expressed on brain cells and in the human brain.

PALM: a powerful and adaptive latent model for prioritizing risk variants with functional annotations.

HTRX: an R package for learning non-contiguous haplotypes associated with a phenotype.

Self-supervised graph representation learning integrates multiple molecular networks and decodes gene-disease relationships
