Introduction: Somatic mutations in blood stem cells can drive clonal expansion and result in clonal haematopoiesis (CH). CH is a precursor to myeloid malignancies and is increasingly recognised as a risk factor for non-malignant diseases. CH has been investigated in Europeans but remains understudied in non-European populations. Here, we investigate the causes and consequences of CH in two large prospective cohort studies: the Mexico City Prospective Study (MCPS) and UK Biobank (UKB). The MCPS analysis represents the largest CH study in a non-European population to date.
Methods: Admixed Americans were identified among MCPS participants (n=136,401) and Europeans among UKB participants (n=419,228). Somatic variant calling was performed using Mutect2 on whole exome sequencing (WES) data across a selected panel of genes to identify putative CH driver mutations. WES was performed at an average depth of 55x, and CH was defined using a variant allele frequency of ≥3%. Genome-wide association studies (GWAS) for CH were performed using imputed array data; exome-wide association studies (ExWAS) and gene-level association analyses were performed using WES data. Global (continental-level) ancestry was assigned based on a peddy score of ≥95%, while local ancestry was inferred using RFMix (Ziyatdinov et al., 2022).
Results: The most recurrently mutated genes in MCPS and UKB were DNMT3A, TET2, ASXL1, PPM1D, TP53, SF3B1 and SRSF2. The prevalence of CH increased progressively with age, reaching approximately 36% (8 of 22 individuals) among those aged 100 years and above. CH was 40% more prevalent in age-matched Europeans (4.96%) than in Admixed Americans (3.10%). Inter-population comparison showed that overall CH and DNMT3A-, TET2-, ASXL1-, PPM1D-, TP53-, JAK2- and SRSF2-mutant CH were more prevalent in UKB than in MCPS (Figure A).
Intra-population analysis of the Admixed American cohort further revealed that individuals with a higher fraction of European ancestry were at higher risk of overall CH and of DNMT3A-, ASXL1- and SRSF2-mutant CH, but not of TET2-, PPM1D-, TP53- or JAK2-mutant CH. These findings suggest differences in the relative contribution of genetic, lifestyle or environmental factors to CH driven by specific genes. CH GWAS performed in Admixed Americans recapitulated variants previously reported in Europeans and also identified novel, ancestry-specific variants associated with CH risk, including SNPs upstream of TCL1B (rs968294563: OR=1.79, P=2.01×10⁻⁹; rs187319135: OR=1.85, P=2.69×10⁻⁹). Notably, TCL1B variants were associated with an increased risk of TET2- and ASXL1-mutant CH (rs968294563: OR=3.17, P=3.82×10⁻¹⁶ for TET2; OR=2.36, P=2.40×10⁻³ for ASXL1), but a decreased risk of DNMT3A-mutant CH (rs968294563: OR=0.51, P=6.32×10⁻⁴) (Figure B). The most common TCL1B variant had a minor allele frequency (MAF) of 1.23% in Admixed Americans but was virtually absent in Europeans. CH ExWAS can identify rare causal variants not captured by genotyping or imputation, and these variants may be in linkage disequilibrium with common variants detectable via GWAS. Indeed, ExWAS in Admixed Americans identified one rare SNP in the TCL1B promoter (rs774615666: OR=2.24, P=1.94×10⁻⁸) that was associated with an increased risk of TET2-mutant CH (OR=4.19, P=2.77×10⁻¹⁰) and a decreased risk of DNMT3A-mutant CH (OR=0.13, P=1.65×10⁻⁴), mirroring our GWAS findings. The rs774615666 risk allele was >200-fold more common in Admixed Americans (MAF: 0.33%) than in Europeans (MAF: 0.0011%) and was not in linkage disequilibrium with the TCL1A promoter CH risk SNP previously reported in Europeans (Weinstock et al., 2023). Meta-analyses combining the MCPS Admixed American and UKB European CH GWAS were performed across 555,629 individuals. Of the 11 loci associated with overall CH, one (GACAT3/CYRIA) was novel.
Lastly, we investigated the phenotypic associations of CH in Admixed Americans. Gene-specific CH was associated with an increased risk of death from haematological malignancies (TET2, TP53, SF3B1 and SRSF2), other cancer types (ASXL1, DNMT3A and SRSF2) and cardiovascular diseases (DNMT3A and TP53).
Conclusions: The substantial difference in CH prevalence between populations and the ancestry-specific genetic associations demonstrate how the analysis of non-European cohorts can generate novel insights, and highlight the importance of such analyses in advancing health equality among human populations.
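As an illustration of the arithmetic behind two figures quoted above (the ≥3% variant allele frequency threshold and the >200-fold MAF difference for rs774615666), the following is a minimal Python sketch. It is not the study's pipeline: the function names, the read counts and the use of the average 55x depth as a per-site depth are assumptions made purely for illustration. At 55x, 0.03 × 55 = 1.65, so under this simplification at least two variant-supporting reads would be needed to reach the threshold.

```python
# Illustrative sketch only: all names and read counts below are hypothetical
# and are not taken from the study's variant-calling pipeline.

VAF_THRESHOLD = 0.03  # CH defined at a variant allele frequency of >= 3%


def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def meets_ch_threshold(alt_reads: int, total_reads: int,
                       threshold: float = VAF_THRESHOLD) -> bool:
    """True if the site's VAF reaches the CH definition threshold."""
    return variant_allele_frequency(alt_reads, total_reads) >= threshold


def fold_difference(freq_a: float, freq_b: float) -> float:
    """Ratio of two allele frequencies given in the same units."""
    if freq_b <= 0:
        raise ValueError("freq_b must be positive")
    return freq_a / freq_b


if __name__ == "__main__":
    # Assumption: per-site depth equals the reported average depth of 55x.
    for alt in (1, 2, 3):
        print(alt, meets_ch_threshold(alt, 55))

    # MAFs quoted for rs774615666, both in percent.
    print(fold_difference(0.33, 0.0011))
```

In practice, depth varies from site to site, so the minimum number of supporting reads implied by a fixed VAF threshold varies with it.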