Genome-wide association mapping within a local Arabidopsis thaliana population more fully reveals the genetic architecture for defensive metabolite diversity.

Abstract

A paradoxical finding from genome-wide association studies (GWAS) in plants is that variation in metabolite profiles typically maps to a small number of loci, despite the complexity of underlying biosynthetic pathways. This discrepancy may partially arise from limitations presented by geographically diverse mapping panels. Properties of metabolic pathways that impede GWAS by diluting the additive effect of a causal variant, such as allelic and genetic heterogeneity and epistasis, would be expected to increase in severity with the geographical range of the mapping panel. We hypothesized that a population from a single locality would reveal an expanded set of associated loci. We tested this in a French Arabidopsis thaliana population (less than 1 km transect) by profiling and conducting GWAS for glucosinolates, a suite of defensive metabolites that have been studied in depth through functional and genetic mapping approaches. For two distinct classes of glucosinolates, we discovered more associations at biosynthetic loci than the previous GWAS with continental-scale mapping panels. Candidate genes underlying novel associations were supported by concordance between their observed effects in the TOU-A population and previous functional genetic and biochemical characterization. Local populations complement geographically diverse mapping panels to reveal a more complete genetic architecture for metabolic traits. This article is part of the theme issue ‘Genetic basis of adaptation and speciation: from loci to causative mutations’.

Similar Papers
  • Front Matter
  • Cite Count: 3
  • 10.4065/mcp.2011.0337
Genome-Wide Association Studies Go Green: Novel and Cost-Effective Opportunities for Identifying Genetic Associations
  • Jul 1, 2011
  • Mayo Clinic Proceedings
  • Celine M Vachon

  • Research Article
  • Cite Count: 18
  • 10.1176/appi.ajp.2010.10030465
Genome-Wide Association Studies: Does Only Size Matter?
  • Jul 1, 2010
  • American Journal of Psychiatry
  • Sharon Schwartz + 1 more

  • Research Article
  • Cite Count: 12
  • 10.1371/journal.pone.0205564
Assessment of heterosis in two Arabidopsis thaliana common-reference mapping populations.
  • Oct 12, 2018
  • PLOS ONE
  • Marieke H A Van Hulten + 10 more

Hybrid vigour, or heterosis, has been of tremendous importance in agriculture for the improvement of both crops and livestock. Notwithstanding large efforts to study the phenomenon of heterosis in the last decades, the identification of common molecular mechanisms underlying hybrid vigour remain rare. Here, we conducted a systematic survey of the degree of heterosis in Arabidopsis thaliana hybrids. For this purpose, two overlapping Arabidopsis hybrid populations were generated by crossing a large collection of naturally occurring accessions to two common reference lines. In these Arabidopsis hybrid populations the range of heterosis for several developmental and yield related traits was examined, and the relationship between them was studied. The traits under study were projected leaf area at 17 days after sowing, flowering time, height of the main inflorescence, number of side branches from the main stem or from the rosette base, total seed yield, seed weight, seed size and the estimated number of seeds per plant. Predominantly positive heterosis was observed for leaf area and height of the main inflorescence, whereas mainly negative heterosis was observed for rosette branching. For the other traits both positive and negative heterosis was observed in roughly equal amounts. For flowering time and seed size only low levels of heterosis were detected. In general the observed heterosis levels were highly trait specific. Furthermore, no correlation was observed between heterosis levels and the genetic distance between the parental lines. Since all selected lines were a part of the Arabidopsis genome wide association (GWA) mapping panel, a genetic mapping approach was applied to identify possible regions harbouring genetic factors causal for heterosis, with separate calculations for additive and dominance effects. 
Our study showed that the genetic mechanisms underlying heterosis were highly trait specific in our hybrid populations and greatly depended on the genetic background, confirming the elusive character of heterosis.

  • Research Article
  • Cite Count: 14
  • 10.1016/j.atherosclerosis.2021.05.001
Systematic review of genome-wide association studies of abdominal aortic aneurysm
  • May 12, 2021
  • Atherosclerosis
  • Tejas P Singh + 4 more

  • Research Article
  • 10.1158/1538-7445.am2018-226
Abstract 226: Single variant and gene-based replication analysis of reproductive aging in African American women in the AMBER Consortium
  • Jul 1, 2018
  • Cancer Research
  • Marie V Coignet + 9 more

The two main hormonal events of a woman's life, menarche and menopause, have a paramount impact on the duration of exposure to estrogen. Reproductive aging phenotypes, including age at menarche (AM) and age at natural menopause (ANM), have been consistently associated with breast cancer risk. Despite an estimated strong genetic component, genome-wide association studies (GWAS) for AM and ANM found that common variants identified to date account for only 7.4% for SNPs related to AM and 2.5-4.1% for ANM. As most previous GWAS on AM and ANM were conducted in women of European ancestry (EA), studies examining genetic components of reproductive aging in African-American (AA) women are needed. We hypothesize that although the index GWAS variants discovered in EA women may differ from those in AA women, rare and low-frequency causal variants may reside in the same genetic regions. A candidate analysis of previously identified GWAS variants and genes in association with AM and ANM was conducted in the African American Breast Cancer Epidemiology and Risk (AMBER) Consortium. All SNPs within a 500kb window of previously discovered GWAS SNPs for AM and ANM were extracted from the Illumina Human Exome Beadchip v1.1, leading to 1,505 candidate SNPs from 125 genes for AM and 1,198 candidate SNPs from 35 genes for ANM in a total of 7,886 AA subjects. Single SNP association tests were run in PLINK using linear regressions for the continuous trend test for AM/ANM and logistic regression for the extreme AM phenotype (<11 v. ≥15 years). The SKAT-O test for the gene-based analyses was performed using the SKAT R package to aggregate variants with an MAF upper bound of 5%. The top variants related to AM were two SNPs, rs314277 (β=0.11, MAF= 0.45, p=6.24E-05) and rs4742314 (β= -0.11, MAF= 0.39, p=6.44E-05), located in LIN28B and KDM4C respectively. 
rs974828 (RORA, MAF= 0.23, OR=0.71, p=0.0003) and rs314277 (p=0.0007) were found to be the top variants in association with the extreme AM phenotype. For ANM, rs16991615, located in MCM8, was the most significant variant associated with increased ANM (β=2.06, MAF= 0.01, p=0.0005). rs314277 (LIN28B) has been previously associated with AM, and rs16991615 (MCM8) had also been related to ANM in previous GWAS in EAs. In gene-based analysis for AM, SLC38A3 (p=0.0007) and WDR6 (p= 0.003) were nominally significant; and EPS8L1 (p= 0.005) and RBM6 (p= 0.01) were associated with the AM phenotype. For ANM, RBMS2 (p= 0.006) was nominally significant in gene-based analysis. This is to date the largest study in AA women for reproductive life events to interrogate rare and low-frequency variants, which are beyond the spectrum of common variants in previous GWAS. Although the overall replication success rate is low, our analyses identified several rare and low-frequency variants in regions from previous GWAS. Our data contributed to the literature on genetic variation for reproductive aging in AA women. Citation Format: Marie V. Coignet, Qianqian Zhu, David G. Cox, Kathryn Lunetta, Elisa V. Bandera, Christopher Haiman, Andrew Olshan, Julie Palmer, Christine Ambrosone, Song Yao. Single variant and gene-based replication analysis of reproductive aging in African American women in the AMBER Consortium [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 226.

  • Research Article
  • 10.1093/neuonc/noac174.183
P10.18.A Replication of previous GWAS identifies TERT and near EGFR SNVs as risk factors in EPIC glioma patients: a nested case-control study
  • Sep 5, 2022
  • Neuro-Oncology
  • W Wu + 5 more

Background: Gliomas, the most common malignant primary brain tumors in adults, typically have a poor prognosis irrespective of medical care. Previous large genome-wide association studies (GWAS) have identified 27 single-nucleotide variants (SNVs) that are significantly associated with glioma. However, most of these GWAS used case-control designs, which are prone to bias when rapidly lethal cases have no chance of being included in the study. This study aims to replicate the previous GWAS findings using a prospective study design. Material and Methods: We conducted a nested case-control study within the European Prospective Investigation into Cancer (EPIC) cohort from 7 European countries. The GSA-MD Infinium global screening array was used for genotyping. Some subjects had previously been genotyped on other platforms. In total, 468 glioma patients and 481 controls were included. The genotypes of the 27 SNVs were extracted; for ungenotyped SNVs, datasets were imputed using SHAPEIT v4.1.3 and IMPUTE5 v1.1.5 based on the Haplotype Reference Consortium (Release 1.1) reference panel. A conditional logistic regression model was used to investigate the additive effect of SNVs on the risk of glioma. Results: 21 SNVs showed a direction of effect consistent with previous studies, whereas 6 SNVs did not (ORs between 0.72 and 0.99, not significant). After adjusting for multiple testing, two SNVs, rs10069690 (TERT) and rs75061358 (near EGFR), were significantly associated with glioma risk. The OR for rs75061358 (2.23, 95% CI = 1.49-3.33) was more prominent in our study than in the previous GWAS, which implies that rs75061358 might not only be a risk factor but also affect survival. A different risk direction was observed for rs77633900 in the ETFA gene (OR = 0.72, 95% CI = 0.51-1.01). Conclusion: Our findings further confirm the role of genetics in the etiology of glioma in the European population. The potential biases in previous GWAS remain to be elucidated.

  • Research Article
  • Cite Count: 3
  • 10.1007/s00125-025-06420-8
The genetics of low and high birthweight and their relationship with cardiometabolic disease
  • Apr 10, 2025
  • Diabetologia
  • Gunn-Helen Moen + 4 more

Aims/hypothesis: Low birthweight infants are at increased risk not only of mortality, but also of type 2 diabetes mellitus and CVD in later life. At the opposite end of the spectrum, high birthweight infants have increased risk of birth complications, such as shoulder dystocia, neonatal hypoglycaemia and obesity, and similarly increased risk of type 2 diabetes mellitus and CVD. However, previous genome-wide association studies (GWAS) of birthweight in the UK Biobank have primarily focused on individuals within the ‘normal’ range and have excluded individuals with high and low birthweight (<2.5 kg or >4.5 kg). The aim of this study was to investigate genetic variation associated within the tail ends of the birthweight distribution, to: (1) see whether the genetic factors operating in these regions were different from those that explained variation in birthweight within the normal range; (2) explore the genetic correlation between extremes of birthweight and cardiometabolic disease; and (3) investigate whether analysing the full distribution of birthweight values, including the extremes, improved the ability to detect genuine loci in GWAS. Methods: We performed case–control GWAS analysis of low (<2.5 kg) and high (>4.5 kg) birthweight in the UK Biobank using REGENIE software (Nlow=20,947; Nhigh=12,715; Ncontrols=207,506) and conducted three continuous GWAS of birthweight, one including the full range of birthweights, one involving a truncated GWAS including only individuals with birthweights between 2.5 and 4.5 kg and a third GWAS that winsorised birthweight values <2.5 kg and >4.5 kg. 
Additionally, we performed bivariate linkage disequilibrium (LD) score regression to estimate the genetic correlation between low/normal/high birthweight and cardiometabolic traits. Results: Bivariate LD score regression analyses suggested that high birthweight had a mostly similar genetic aetiology to birthweight within the normal range (genetic correlation coefficient [rG]=0.91, 95% CI 0.83, 0.99), whereas there was more evidence for a separate set of genes underlying low birthweight (rG=−0.74, 95% CI 0.66, 0.82). Low birthweight was also significantly positively genetically correlated with most cardiometabolic traits and diseases we examined, whereas high birthweight was mostly positively genetically correlated with adiposity and anthropometric-related traits. The winsorisation strategy performed best in terms of locus detection, with the number of independent genome-wide significant associations (p<5×10⁻⁸) increasing from 120 genetic variants at 94 loci in the truncated GWAS to 270 genetic variants at 178 loci, including 27 variants at 25 loci that had not been identified in previous birthweight GWAS. This included a novel low-frequency missense variant in the ABCC8 gene, a gene known to be involved in congenital hyperinsulinism, neonatal diabetes mellitus and MODY, that was estimated to be responsible for a 170 g increase in birthweight amongst carriers. Conclusions/interpretation: Our results underscore the importance of genetic factors in the genesis of the phenotypic correlation between birthweight and cardiometabolic traits and diseases.
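
The difference between the truncated and winsorised phenotypes described in the abstract above can be illustrated with a minimal Python sketch. It assumes that winsorising means clamping values to the 2.5 kg and 4.5 kg thresholds; the abstract does not state the exact replacement rule, and the function names and sample values are hypothetical.

```python
def winsorise(values, lower=2.5, upper=4.5):
    """Clamp each value into [lower, upper]; in-range values are unchanged.

    Assumption: out-of-range birthweights are set to the threshold itself.
    """
    return [min(max(v, lower), upper) for v in values]


def truncate(values, lower=2.5, upper=4.5):
    """Drop values outside [lower, upper], as in the truncated GWAS."""
    return [v for v in values if lower <= v <= upper]


# Hypothetical birthweights in kg, for illustration only.
birthweights_kg = [1.9, 2.5, 3.4, 4.5, 5.1]

print(winsorise(birthweights_kg))  # [2.5, 2.5, 3.4, 4.5, 4.5]
print(truncate(birthweights_kg))   # [2.5, 3.4, 4.5]
```

Winsorising keeps every individual in the analysis while limiting the influence of extreme values, whereas truncation reduces the sample size.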

  • Research Article
  • Cite Count: 10
  • 10.1200/jco.2017.35.6_suppl.1
Prostate cancer meta-analysis from more than 145,000 men to identify 65 novel prostate cancer susceptibility loci.
  • Feb 20, 2017
  • Journal of Clinical Oncology
  • Rosalind Eeles + 12 more

Background: Currently genome-wide association studies (GWAS) have identified over 100 prostate cancer (PrCa) susceptibility loci, capturing 33% of the PrCa familial relative risk (FRR) in Europeans. To identify further susceptibility variants, we conducted a PrCa GWAS, larger than previous studies, comprising ~49,000 cases and ~29,000 controls among individuals of European and Asian descent using the OncoArray, a platform consisting of a 260K GWAS backbone and 310K custom content selected from previous GWAS and fine-mapping studies of multiple cancers (http://epi.grants.cancer.gov/oncoarray/). Methods: Genotypes from the OncoArray were used to impute genotypes for ~70M variants using the October 2014 release of the 1000 genomes project as a reference, and then combined with several previous PrCa GWAS of European ancestry: UK stage 1 (1,906 cases/1,934 controls) and stage 2 (3,888 cases/3,956 controls); CaPS 1 (498 cases/502 controls) and CaPS 2 (1,483 cases/519 controls); BPC3 (2,137 cases/3,101 controls); NCI PEGASUS (4,622 cases/2,954 controls); and iCOGS (21,209 cases/20,440 controls). Risk analyses for overall PrCa risk, aggressive PrCa (several definitions defined by PrCa clinical characteristics), and Gleason score were performed. Logistic and linear regression summary statistics were meta-analysed using an inverse variance fixed effect approach. Results: We identified novel loci significantly associated (P < 5.0×10⁻⁸) with overall PrCa (N = 65). Our novel findings are comprised of several missense variants, including a SNP in the ATM gene - a key member of the DNA repair pathway. When combined multiplicatively, the 65 novel PrCa loci identified here increase the captured heritability of PrCa, explaining 38.5% of the FRR when combining novel and previously identified PrCa loci. 
Conclusions: In risk stratification, men in the top 1% of the genetic risk score group have a relative risk of 5.6 fold for developing PrCa compared with the median risk group. These results will improve the utility of genetic risk scores for targeted screening and prevention for prostate cancer.
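
The "inverse variance fixed effect approach" mentioned in the Methods can be sketched as follows. This is a minimal illustration of the standard textbook formula, not the authors' pipeline; the function name and the example effect sizes are hypothetical.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    Each study is weighted by 1 / SE^2. Returns the pooled effect
    estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Hypothetical per-study log odds ratios and standard errors for one variant.
beta, se = fixed_effect_meta([0.10, 0.20], [0.05, 0.05])
print(round(beta, 4), round(se, 4))
```

Studies with smaller standard errors (typically the larger ones) contribute more to the pooled estimate, and the pooled standard error is never larger than that of the most precise study.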

  • Dissertation
  • 10.18174/352650
Using natural variation to unravel the dynamic regulation of plant performance in diverse environments
  • Jan 1, 2015
  • J.A Molenaar

Summary All plants are able to respond to changes in their environment by adjusting their morphology and metabolism, but large differences are observed in the effectiveness of these responses in the light of plant fitness. Between and within species large differences are observed in plant responses to drought, heat and other abiotic stresses. This natural variation is partly due to variation in the genetic composition of individuals. Within-species variation can be used to identify and study genes involved in the genetic regulation of plant performance. Growth of the world population will, in the coming years, lead to an increased demand for food, feed and other natural products. In addition, extreme weather conditions with, amongst others, more and prolonged periods of drought and heat are expected to occur due to climate change. Therefore breeders are challenged to produce stress tolerant cultivars with improved yield under sub-optimal conditions. Knowledge about the mechanisms and genes that underlie tolerance to drought, heat and other abiotic stresses will ease this challenge. The aim of this thesis was to identify and study the role of genes that are underlying natural variation in plant performance under drought, salt and heat stress. To reach this goal a genome wide association (GWA) mapping approach was taken in the model species Arabidopsis thaliana. A population of 350 natural accessions of Arabidopsis, genotyped with 215k SNPs, was grown under control and several stress conditions and plant performance was evaluated by phenotyping one or several plant traits per environment. Genes located in the genomic regions that were significantly associated with plant performance, were studied in more detail. Plant performance was first evaluated upon osmotic stress (Chapter 2). This treatment resulted not only in a reduced plant size, but also caused the colour of the rosette leaves to change from green to purple-red due to anthocyanin accumulation. 
The latter was visually quantified and subsequent GWA mapping revealed that a large part of the variation in anthocyanin accumulation could be explained by a small genomic region on chromosome 1. The analysis of re-sequence data allowed us to associate the second most frequent allele of MYB90 with higher anthocyanin accumulation and to identify the causal SNP. Interestingly MYB75, a close relative of MYB90, was not identified by GWA mapping, although causal sequence variation of this gene for anthocyanin accumulation was identified in the Cvi x Ler and Ler x Eri-1 RIL populations. Re-sequence data revealed that one allele of MYB75 was dominating the population and that the MYB75 alleles of Cvi and Ler were both rare, explaining the lack of association at this locus in GWA mapping. For MYB90, two alleles were present in a substantial part of the population, suggesting balancing selection between them. Next, the natural population was exposed to short-term heat stress during flowering (Chapter 3). This short-term stress has a large impact on seed set, while it hardly affects the vegetative tissues. Natural variation for tolerance against the effect of heat on seed set was evaluated by measuring the length of all siliques along the inflorescence in both heat-treated and control plants. Because the flower that opened during the treatment was tagged, we could analyse the heat response for several developmental stages separately. GWA mapping revealed that the heat response before and after anthesis involved different genes. For the heat response before anthesis strong evidence was gained that FLC, a flowering time regulator and QUL2, a gene suggested to play a role in vascular tissue development, were causal for two strong associations. Furthermore, the impact of moderate drought on plant performance was evaluated in the plant phenotyping platform PHENOPSIS. 
Homogeneous drought was assured by tight regulation of climate cell conditions and the robotic weighing and watering of the pots twice a day. Because plant growth is a dynamic trait it was monitored over time by top-view imaging under both moderate drought and control conditions (Chapter 4 and 5). To characterise growth it was modelled with an exponential function. GWA mapping of temporal growth data resulted in the detection of time-dependent QTLs whereas mapping of model parameters resulted in another set of QTLs related to the entire growth period. Most of these QTLs would not have been identified if plant size had only been determined on a single day. For the QTLs detected under control conditions eight candidate genes with a growth-related mutant or overexpression phenotype were identified (Chapter 4). Genes in the support window of the drought-QTLs were prioritized based on previously reported gene expression data (Chapter 5). Additional validation experiments are needed to confirm causality of the candidate genes. Next, to search for genes that determine plant size across many environments, biomass accumulation in the natural population was determined in 25 different environments (Chapter 6). Joint analysis of these data by multi-environment GWA mapping resulted in the detection of 106 strongly associated SNPs with significant effects in 7 to 16 environments. Several genes involved in starch metabolism, leaf size control and flowering time determination were located in close proximity of the associated SNPs. Two genes, RPM1 and ACD6, were located in close proximity of SNPs with significant GxE effects. For both genes, alleles have been identified that increase resistance to bacterial infection, but that reduce biomass accumulation. The sign of the allelic effect is therefore dependent on the environmental conditions. 
Whole genome predictions revealed that most of the GxE interactions observed at the phenotypic level were not the consequence of strong associations with strong QxE effects, but of moderate and weak associations with weak QxE effects. Finally, in Chapter 7 I discuss the usefulness of GWA mapping in the identification of genes underlying natural variation in plant performance under drought, heat stress and a number of other environments. Strong associations were observed for both environment-specific as well as common plant performance regulators. Some choices in phenotyping and experimental design were crucial for our success, like evaluation of plant performance over time and simplification of the quantification of the phenotype. It is suggested that follow-up work should focus on the functional characterization of the causal genes, because such analyses would be helpful to identify pathways in which the causal genes are involved and to understand why sequence variation results in changes at the phenotype level. Although translation of the findings to applications in crops is challenging, this thesis contributes to the understanding of the genetic regulation of stress response and therefore will likely contribute to the development of stress tolerant and stable yielding crops.

  • Dissertation
  • 10.5353/th_b4852156
Genome-wide association study on colorectal cancer in the Hong Kong Chinese population
  • Jan 1, 2012
  • Siu-Chung Choi

Colorectal cancer (CRC) is the second most common cancer in Hong Kong. While high-penetrance germline mutations account for up to 6% of cases, much of the variation in genetic risk may be attributable to multiple low-penetrance variants. Previous genome wide association studies (GWAS) have identified a number of CRC susceptibility alleles in Caucasian populations. Our GWAS investigated the association between genetic variants and CRC risk in the Han Chinese population in Hong Kong. In Stage I, genomic DNA samples from 455 female Chinese CRC subjects were genotyped using the Illumina 610 Quad SNP chip. Association analysis was performed on 439 cases and 771 general population female controls recruited for a study on bone mineral density. Population stratification was examined through principal components analysis using EIGENSTRAT version 2.0. From the association results, 46 SNPs (Group 1) were selected for follow-up replication (Stage II), together with 10 SNPs (Group 2) from previous GWAS studies. Genomic DNA samples from 3,571 Chinese subjects were genotyped using the Sequenom MassARRAY system. Association analysis was performed on 1,505 cases and 1,452 controls. 5 SNPs (rs835378, rs2652007, rs2139273, rs2139273 and rs9286410) exceeded the genome-wide significance level in Stage I, although none replicated in Stage II, suggesting genotyping error. Results from Stage II showed that the three most significant SNPs were among those selected from the previous studies, yet their significance levels in Stage I were very weak. None of the SNPs selected from Stage I was significant at p<0.01 in Stage II. Two composite scores of genetic susceptibility, one for each group of SNPs, were calculated in Stage II genotype data, as the total number of high-risk alleles (according to the direction of effect in Stage I results or previous GWAS) present in an individual. 
Both composite scores were significantly associated with CRC risk in Stage II (Group 1, p = 2.38 × 10⁻⁵, beta = 0.046, SE = 0.012; Group 2, p = 1.06 × 10⁻⁷, beta = 0.10, SE = 0.019), suggesting that while we had insufficient power to confirm individual SNPs identified in our GWAS and the previous GWAS, these findings indicate that the SNP sets selected from Stage I results, as well as those selected from previous GWAS, contain SNPs with genuine effects on CRC risk. One SNP, rs10795668 (OR = 0.79, 95% CI: 0.71–0.87, p = 3.78 × 10⁻⁶), was significantly associated with CRC risk in Stage II after adjustment for multiple testing. Two further SNPs, rs6983267 and rs4939827, also achieved suggestive p-values in Stage II. All these SNPs were selected from previous GWAS in the Caucasian population, demonstrating that shared genetic factors operate for CRC in diverse populations.
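
The composite score described above, the total number of high-risk alleles an individual carries, can be sketched in a few lines of Python. The data layout, the function name and the SNP labels (`snp1` to `snp3`) are hypothetical and are not taken from the thesis.

```python
def composite_score(genotypes, risk_alleles):
    """Count the high-risk alleles carried by one individual.

    genotypes:    dict mapping SNP id -> two-letter genotype, e.g. "AG"
    risk_alleles: dict mapping SNP id -> the allele treated as high-risk
    SNPs missing from the individual's genotypes contribute 0.
    """
    return sum(
        genotypes[snp].count(allele)
        for snp, allele in risk_alleles.items()
        if snp in genotypes
    )


# Hypothetical individual and risk-allele definitions, for illustration only.
individual = {"snp1": "AA", "snp2": "AG", "snp3": "CC"}
risk = {"snp1": "A", "snp2": "G", "snp3": "T"}

print(composite_score(individual, risk))  # 3
```

Each SNP contributes 0, 1 or 2 to the score, so the score is an unweighted allele count; a weighted variant would multiply each count by an effect size, which the abstract does not describe.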

  • Abstract
  • 10.1016/j.juro.2018.02.2253
MP70-09 IDENTIFICATION OF NINE NEW SUSCEPTIBILITY LOCI FOR PROSTATE CANCER IN THE JAPANESE POPULATION
  • Apr 1, 2018
  • The Journal of Urology
  • Ryo Takata

  • Research Article
  • Cite Count: 2
  • 10.1111/tpj.16163
Embracing diversity: a genetic marker dataset with increased marker density facilitates association studies in maize
  • Mar 1, 2023
  • The Plant Journal
  • Gwendolyn K. Kirschner

  • Research Article
  • Cite Count: 23
  • 10.1007/s00122-019-03528-5
Identification of quantitative trait loci for net form net blotch resistance in contemporary barley breeding germplasm from the USA using genome-wide association mapping.
  • Jan 3, 2020
  • Theoretical and Applied Genetics
  • Anil Adhikari + 4 more

An association mapping study conducted in a population of 3490 elite barley breeding lines from ten barley breeding programs of the USA identified 12 QTLs for resistance/susceptibility to net form of net blotch. Breeding resistant varieties is the best management strategy for net form of net blotch (NFNB) in barley (Hordeum vulgare L.) caused by Pyrenophora teres f. teres (Ptt). Several resistance QTL have been previously identified in barley via linkage mapping and genome-wide association studies (GWAS). A GWAS conducted in a collection of advanced breeding lines (n = 3490) representing elite germplasm from ten barley breeding programs of the USA identified 42 unique marker-trait associations (MTA) for NFNB resistance. The lines were genotyped with 3072 SNP markers and phenotyped with four Ptt isolates in a controlled environment. The lines were used to construct 13 different GWAS panels. An efficient mixed-model association method with principal components and kinship was used for GWAS. The significance threshold for MTA was set at a false discovery rate of 0.05. Two, eight, six, one and 25 MTA were identified in chromosomes 1H, 3H, 4H, 5H and 6H, respectively. Based on genetic positions and linkage disequilibrium, these MTAs correspond to two, three, two, one and four QTLs in chromosomes 1H, 3H, 4H, 5H and 6H, respectively. A comparison with previous linkage and GWAS studies revealed several previously identified and novel QTLs. Moreover, different genomic regions were found to be responsible for NFNB resistance in two-row versus six-row germplasm. The germplasm-specific SNP markers with additive effects and allelic distribution are reported to help breeders select markers for MAS to introgress novel net blotch resistance.
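
A significance threshold "set at a false discovery rate of 0.05" is commonly implemented with the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR procedure was used, so the following is a sketch under that assumption; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a list of booleans marking which tests pass the BH procedure.

    Assumption: Benjamini-Hochberg is the FDR method; the source only
    states the FDR level (0.05).
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Find the largest rank k with p_(k) <= k * fdr / m.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= rank * fdr / m:
            cutoff_rank = rank

    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[index] = True
    return significant


# Hypothetical marker p-values, for illustration only.
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))  # [True, False, False, False]
```

Unlike a Bonferroni correction, the per-test threshold grows with the rank of the p-value, which is why the procedure is often preferred when many markers are tested.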

  • Research Article
  • Cite Count: 24
  • 10.1093/jnci/djac087
Genetic Analysis of Lung Cancer and the Germline Impact on Somatic Mutation Burden.
  • May 2, 2022
  • Journal of the National Cancer Institute
  • Aurélie A G Gabriel + 32 more

Background: Germline genetic variation contributes to lung cancer (LC) susceptibility. Previous genome-wide association studies (GWAS) have implicated susceptibility loci involved in smoking behaviors and DNA repair genes, but further work is required to identify susceptibility variants. Methods: To identify LC susceptibility loci, a family history-based genome-wide association by proxy (GWAx) of LC (48 843 European proxy LC patients, 195 387 controls) was combined with a previous LC GWAS (29 266 patients, 56 450 controls) by meta-analysis. Colocalization was used to explore candidate genes and overlap with existing traits at discovered susceptibility loci. Polygenic risk scores (PRS) were tested within an independent validation cohort (1 666 LC patients vs 6 664 controls) using variants selected from the LC susceptibility loci and a novel selection approach using published GWAS summary statistics. Finally, the effects of the LC PRS on somatic mutational burden were explored in patients whose tumor resections have been profiled by exome (n = 685) and genome sequencing (n = 61). Statistical tests were 2-sided. Results: The GWAx–GWAS meta-analysis identified 8 novel LC loci. Colocalization implicated DNA repair genes (CHEK1), metabolic genes (CYP1A1), and smoking propensity genes (CHRNA4 and CHRNB2). PRS analysis demonstrated that these variants, as well as subgenome-wide significant variants related to expression quantitative trait loci and/or smoking propensity, assisted in LC genetic risk prediction (odds ratio = 1.37, 95% confidence interval = 1.29 to 1.45; P < .001). Patients with higher genetic PRS loads of smoking-related variants tended to have higher mutation burdens in their lung tumors. Conclusions: This study has expanded the number of LC susceptibility loci and provided insights into the molecular mechanisms by which these susceptibility variants contribute to LC development.

  • Discussion
  • Cite Count Icon 8
  • 10.1002/cac2.12317
Transcriptome‐wide association analysis identified candidate susceptibility genes for nasopharyngeal carcinoma
  • Jun 1, 2022
  • Cancer Communications
  • Yong‐Qiao He + 35 more

Nasopharyngeal carcinoma (NPC) is a common malignancy in East and Southeast Asia, especially in South China. The etiology of NPC has been linked to genetic susceptibility, Epstein-Barr virus (EBV) infection, and environmental factors. Accumulated evidence, including multiple genome-wide association studies (GWASs), has revealed a robust genetic predisposition to NPC. However, GWAS-identified genetic variants collectively account for only 8.2% of NPC heritability [1]. The underlying inherited predisposition is largely undetermined. The strongest genetic signal for NPC consistently hits the human leukocyte antigen (HLA) region on 6p21 [2]. However, the highly polymorphic nature and complicated long-range linkage disequilibrium (LD) in the HLA region particularly obscure the causal variants driving the association. In addition, most genetic variants are located in introns or intergenic regions. The causal genes mediating genetic effects on NPC risk have rarely been ascertained by GWAS alone. Recently, the transcriptome-wide association study (TWAS) has been proposed as an attractive approach to identify novel gene-trait associations and prioritize causal genes for complex traits [3]. By integrating GWAS and gene expression data, TWAS can effectively and economically assess associations between genetically predicted gene expression levels and disease risks in large populations. Hence, using the cis-regulated expression in addition to genetic variants to explore NPC susceptibility genes could be promising and reasonable for mechanistic and functional inference. Nevertheless, no public expression data for nasopharyngeal tissue were available, and no TWAS for NPC had been conducted. Herein, we integrated genome and transcriptome data of 89 nasopharyngeal tumor tissues and investigated the associations between predicted gene expression levels and NPC risk using multicenter GWAS data involving 4506 NPC cases and 5384 cancer-free subjects (defined as controls) from South China.
Given the close relationship between EBV infection and NPC, a cis-regulated expression weight matrix from EBV-transformed lymphocytes (n = 117) in the GTEx project was used for further evaluation. Study populations and detailed methodology are described in the Supplementary file of methods. We predicted the expression levels of 2505 and 2411 genes in the GWAS population by constructing the models for the prediction of gene expression in nasopharyngeal tissues (NP models) and EBV-transformed lymphocytes (lymphocyte models), respectively (Supplementary Table S1), and 377 genes overlapped (Supplementary Figure S1). Thirty-three genes were associated with NPC at a Bonferroni-corrected threshold, and all were located in the HLA region (Figure 1A). Among them, 11 of 13 previously reported genes were replicated. Our results were consistent with the studies focusing on the HLA region in South China [4, 5], where most of the reported genes available in TWAS were replicated. The predicted expression levels of ZFP57 (NP models), MICA (both models), and HLA-C (lymphocyte models) were significantly higher in cases than in controls, while the expression levels of MOG, HCG27, HLA-DQB1, HLA-H, HLA-U (NP models), HLA-F (both models), HLA-A, and HLA-DRB1 (lymphocyte models) were lower in cases than in controls. The two overlapping genes showed similar associations with NPC (HLA-F: Z score = -10.28 and -8.95; MICA: Z score = 7.82 and 6.60, for NP and lymphocyte models, respectively) (Supplementary Table S2). Interestingly, half of the previously reported genes belonged to HLA class I. Most of them showed lower levels of predicted expression in cases than in controls, possibly because EBV transcripts in NPC tumors were involved in the inhibition of HLA class I gene expression [6]. 
It is rational to assume that the low expression levels of these genes may affect the anti-EBV immune response in presenting peptides to cytotoxic T cells, facilitating immune evasion of tumor cells or EBV-mediated oncogenic action.
Figure 1. TWAS-identified susceptibility genes and pathways for NPC. (A) Manhattan plot of TWAS in NP models and lymphocyte models. The blue lines represent the Bonferroni-corrected significance threshold. The red dots above or below the blue line represent the genes that passed the Bonferroni threshold in the association analysis. The genes with green labels have been reported to be associated with NPC by previous genome-wide or candidate pathway association studies. The genes with black labels were newly identified as NPC susceptibility genes by our study. The genes on different chromosomes are shown as light and dark grey dots. (B) Expression quantitative trait locus analysis for the seven putative causal genes in the expression data of 89 nasopharyngeal tissue samples. The Kruskal-Wallis test was used to compare medians among three genotypes for most of the variants. In a certain homozygote group, the P values were recalculated using only the wild-type and heterozygous groups for the expression of MICD, HCG27 and HLA-DOB by excluding the groups with a sample size less than 5. (C) GO pathway enrichment analysis of NPC. (D) KEGG pathway enrichment analysis of NPC. "Gene Ratio" refers to the percentage of total significant genes in the given pathway. All 354 significant genes (P < 0.05) in TWAS were used in the enrichment analysis. Abbreviations: TWAS, Transcriptome-wide association analysis; NPC, Nasopharyngeal carcinoma; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Although the significant signals consistently hit the HLA region, 22 additional genes not previously reported were identified in TWAS.
Among them, the predicted expression levels of 9 genes were significantly higher in NPC cases than in controls, including HLA-DOB, HCG4B, RPL23AP1, HLA-J in NP models and HCG4, CCHCR1, STK19, C4B, IFITM4P in lymphocyte models, while 13 other genes showed significantly lower expression levels in cases than in controls, including HCP5, ZSCAN23, HCG4P11, HCG4P7, MICD, MICB-DT, SNHG32 in NP models and NOTCH4, C4A, HCG22, POU5F1, MICE, HLA-S in lymphocyte models (Figure 1A). We performed conditional analyses to determine whether the associations between predicted gene expression levels and NPC were influenced by the GWAS signals. After conditioning on the respective GWAS index SNP, the associations for HLA-DOB, NOTCH4, ZSCAN23, STK19, C4B, HLA-J, HLA-S, and MICB-DT remained significant. After conditioning on all previously reported SNPs, NOTCH4, HCG4, HCG22, POU5F1, HCG4B, HCG4P11, MICB-DT, STK19 and IFITM4P remained significant. This indicated that their associations were partially independent of the GWAS signals (Supplementary Table S3). Due to the complicated structure with high LD and co-expression networks in the HLA region, we conducted fine-mapping analyses to prioritize the causal genes. Using posterior inclusion probability (PIP) analysis, we prioritized 7 causal genes: MICA, HLA-DQB1, HLA-DOB, ZSCAN23, HCG27, MICD, and HLA-U. HLA-DOB, ZSCAN23, and MICD were newly identified as NPC susceptibility genes (Supplementary Table S4). Furthermore, we conducted expression quantitative trait locus (eQTL) analyses to determine whether the genetic variants could influence the expression levels of these genes. We found that individuals with relevant risk SNPs (the GWAS index SNPs) exhibited higher expression of HLA-DQB1, MICA, MICD and HLA-U, or lower expression levels of ZSCAN23, HCG27, and HLA-DOB. These results indicated that the risk alleles affected the expression levels of the causal genes (Figure 1B).
Two HLA class II genes (HLA-DQB1 and HLA-DOB) were prioritized as causal genes. Both genes were associated with other virus-associated cancers, such as cervical cancer [7]. A comprehensive TWAS exploring genetic susceptibility for antiviral immune response using 7924 subjects from the UK Biobank cohort revealed that the genetic determinants for EBV infection were predominantly located on HLA class II genes. The most significant signals associated with the antibody level of BamHI Z EBV replication activator (ZEBRA) hit HLA-DQB1 [8]. HLA-DOB may impact viral clearance capacity and persistent infection of hepatitis B virus (HBV) and hepatitis C virus (HCV) [9]. Since EBV reactivation with elevated EBV DNA load or antibodies was observed at the preclinical phase of NPC, we hypothesized that HLA class II genes, especially HLA-DQB1 and HLA-DOB, participate in the early stage of NPC tumorigenesis by influencing EBV infection. Besides, some identified pseudogenes, such as IFITM4P [10], may function by regulating their parental genes. However, their biological mechanisms remain unclear, and further research is needed. Gene Ontology (GO) enrichment analysis confirmed that TWAS-identified genes (354 genes with P < 0.05) were enriched in the pathways of cell-mediated immune response, antigen processing and presentation (Figure 1C). Similarly, the top pathways annotated with the Kyoto Encyclopedia of Genes and Genomes (KEGG) database focused on infection of herpes simplex virus type 1, human T-cell leukemia virus type 1, EBV, and autoimmune disorders such as graft-versus-host disease (Figure 1D). In summary, using a TWAS approach, we corroborated the central role of HLA genes in NPC susceptibility. Apart from HLA class I genes, we propose critical roles of HLA class II genes and other nonclassical HLA genes. Seven genes, including HLA-DQB1 and HLA-DOB, were prioritized as causal genes.
Recent evidence indicated that these genes are pivotal in the metastable equilibrium between host and virus. Our findings provide additional evidence for a better understanding of the genetic etiology of NPC and clues to further advance this field.
We thank the staff of the Sun Yat-sen University Cancer Center biorepository. We thank all the study participants and research staff who recruited participants and collected samples in this study.
This study was funded by the National Key Research and Development Program of China (2021YFC2500400), the Basic and Applied Basic Research Foundation of Guangdong Province, China (2021B1515420007), the Sino-Sweden Joint Research Programme (81861138006), the Science and Technology Planning Project of Guangzhou, China (201804020094), the Special Support Program for High-level Professionals on Scientific and Technological Innovation of Guangdong Province, China (2014TX01R201), the National Natural Science Foundation of China (81973131, 81903395, 81803319, 82003520), and the National Science Fund for Distinguished Young Scholars of China (81325018).
The authors have no potential conflicts of interest to declare.
The Institutional Review Board of Sun Yat-sen University Cancer Center approved this study. Informed consent was obtained from all study participants.
WHJ and YQH devised the project and the main conceptual ideas; YQH, WQX, DHL, and TMW wrote the original draft; DHL and TMW performed the computational analyses; TMW, DWY, CMD, and WLZ contributed to implementation of data processing and analyses; DWY, CMD, YL, WLZ, RWX, LL, HD, XT, YW, TZ, XZL, PFZ, XHZ, SDZ, YZH, MT, YZ, YC and JBZ contributed to the sample preparation; TMW and WLZ contributed to the RNA-seq quantification and quality control pipeline; ETC, ZZ, GH, SMC, QL, LF, YS, MLL, HOA, WY, and THL contributed to the interpretation of the results; YQH, WQX, and TMW revised and wrote the final version of the manuscript; verified the analytical methods; WHJ supervised the project. All authors read and approved the final manuscript.
Methods and materials are available in the supplementary file. The datasets generated and used during the current study are available at Research Data Deposit (RDD) public platform (www.researchdata.org.cn) with the approval RDD number of RDDB2021406340.
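As a rough illustration of the statistics quoted in the letter above (not the authors' pipeline), the sketch below converts the reported Z scores for HLA-F and MICA into two-sided p-values under a standard-normal assumption and compares them with a Bonferroni threshold. The significance level of 0.05 and the use of the per-model gene counts (2505 and 2411) as the number of tests are assumptions, since the letter does not state how the threshold was computed. The function names are hypothetical.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a Z score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Gene counts per model are taken from the letter; alpha = 0.05 is an assumption.
ALPHA = 0.05
N_TESTS = {"NP": 2505, "lymphocyte": 2411}
thresholds = {model: bonferroni_threshold(ALPHA, n) for model, n in N_TESTS.items()}

# Z scores reported in the letter for the two genes present in both models.
z_scores = {
    ("HLA-F", "NP"): -10.28,
    ("HLA-F", "lymphocyte"): -8.95,
    ("MICA", "NP"): 7.82,
    ("MICA", "lymphocyte"): 6.60,
}

# Pair each p-value with whether it passes its model's assumed threshold.
results = {}
for (gene, model), z in z_scores.items():
    p = two_sided_p_from_z(z)
    results[(gene, model)] = (p, p < thresholds[model])

for (gene, model), (p, passes) in results.items():
    print(f"{gene:6s} {model:11s} p = {p:.2e}  passes Bonferroni: {passes}")
```

Under these assumptions all four associations fall well below the corrected threshold, in line with the letter reporting both genes as significant in both models.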
