Introduction: Allele-specific expression (ASE), an imbalance in the expression of the different alleles of a gene, is an intriguing genome-wide and dynamic phenomenon that could have an important role in tumorigenesis. Aside from the impact of copy number alterations (CNAs) and imprinting, little is known about how differential cis- and trans-regulation causes ASE or about its consequences. Acute myeloid leukemia (AML) is an aggressive clonal hematopoietic stem cell malignancy with a low mutational burden in comparison with other cancers, in which ASE may constitute a novel mechanism of pathogenesis.
Aim: To apply whole-genome sequencing (WGS), RNA-seq and Oxford Nanopore methylation sequencing to study the ASE landscape in a cohort of poor-risk AML patients, and to shed light on the mechanisms triggering this phenomenon as well as its temporal dynamics.
Methods: The ASE landscape was explored using paired germline and tumor WGS and RNA-seq on diagnostic bone marrow or peripheral blood samples from 33 cytogenetically poor-risk AML patients, and additional RNA-seq from 18 samples corresponding to follow-up timepoints (including remission, relapse and refractory samples) from eight of the patients. Heterozygous SNPs were identified in the genomic DNA, and allele frequencies in RNA-seq reads were used to quantify ASE, after filtering for CNAs in the tumor DNA. Cas9-enriched Oxford Nanopore sequencing was then performed on available patient samples at diagnosis and subsequent time points. Allele-specific methylation was determined from the Nanopore sequencing data using Guppy v6.3.
Results: In the diagnostic cohort, we identified 4,802 protein-coding genes exhibiting ASE in at least one sample, with a median of 169 genes per sample (range 119-468, with one outlier sample showing ASE in 1,705 genes). Overall, 283 protein-coding genes showed ASE in ≥20% of patients across the 33 samples. These genes included known imprinted genes and the transcription factor GATA2, in which ASE was previously associated with AML, validating our ASE pipeline and filtering criteria. We also identified, for the first time, recurrent ASE in other myeloid genes such as KMT2A (20%); the homeobox genes PBX2 (41%), PBX3 (20%), HOXA6 (20%) and the HOXB cluster (HOXB2 (31%), HOXB3 (27%), HOXB5 (40%), HOXB6 (30%)); and the myeloid differentiation genes CDKN2A (40%), PDGFRB (25%) and PRTN3 (57%). Eighteen primary AML samples from 8/33 patients had paired diagnostic, remission and relapse or refractory material, allowing us to track the dynamic ASE pattern. RNA-seq and Sanger sequencing of cDNA in these samples showed that the ASE observed at diagnosis in HOXB2, HOXB3, HOXB6, PBX3 and GATA2 was lost in the corresponding remission sample (bi-allelic expression), with ASE returning at relapse and being maintained in the refractory samples. To explore the epigenetic mechanisms underlying these allele-specific changes in expression, Cas9-enriched Oxford Nanopore sequencing was performed in 2 patients with available DNA at diagnosis and sequential time points, to determine the pattern of allele-specific methylation of the promoter region of GATA2 (patient 1) and HOXB2 (patient 2). In patient 1 at diagnosis, 90% of CpG sites in the GATA2 promoter region were methylated on one allele versus 61% on the other, with the hypomethylated allele corresponding to the allele that was overexpressed in the ASE sample. In the remission sample, however, the proportion of methylated CpG sites on the two alleles was similar (83% and 73%, respectively), in agreement with the expression of both GATA2 alleles at this stage. Likewise, in the diagnostic sample of patient 2, 6% of CpG sites in the HOXB2 promoter region were methylated on one allele, corresponding to the overexpressed allele in the ASE sample, versus 17% on the other allele. There was no difference in CpG methylation between alleles in the remission sample (9% versus 9%), again correlating with the bi-allelic expression of HOXB2 at remission.
Conclusions: By combining WGS and RNA-seq in primary AML samples, we identified key leukemia genes showing recurrent, dynamic ASE that tracks disease evolution: it is present at diagnosis, lost in remission, regained at relapse and maintained in refractory disease. Our results also suggest allele-specific methylation as a potential mechanism orchestrating this phenomenon.
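Illustrative sketch: the Methods quantify ASE from allele frequencies in RNA-seq reads at heterozygous SNPs, but the abstract does not state which statistical test or thresholds were used. The Python below shows one common way such a step could be done, an exact two-sided binomial test against a 50:50 null. The function names, the null proportion of 0.5 and the read counts are assumptions made for illustration only and are not taken from the study.

```python
from math import comb


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the more highly expressed allele (0.5 = balanced)."""
    total = ref_reads + alt_reads
    if total == 0:
        return float("nan")
    return max(ref_reads, alt_reads) / total


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one heterozygous SNP.

    Assumption: under bi-allelic expression, each read comes from the reference
    allele with probability p_null. Intended for modest read depths; very deep
    coverage would need a log-space or library implementation.
    """
    total = ref_reads + alt_reads
    if total == 0:
        return 1.0
    # Probability of every possible reference-read count under the null.
    pmf = [
        comb(total, k) * p_null**k * (1 - p_null) ** (total - k)
        for k in range(total + 1)
    ]
    observed = pmf[ref_reads]
    # Sum all outcomes that are no more likely than the observed one.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical read counts at a single heterozygous SNP.
ratio = allelic_ratio(30, 10)
p_value = binomial_two_sided_p(30, 10)
print(f"allelic ratio = {ratio:.2f}, p = {p_value:.4g}")
```

In practice, a gene-level call would combine several SNPs per gene, correct for multiple testing and exclude regions affected by CNAs, as the Methods describe; none of those steps is modelled here.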