Background. Chronic lymphocytic leukaemia (CLL) is always preceded by preclinical clonal B-cell disorders such as monoclonal B-cell lymphocytosis (MBL). Pre-malignant clonal B-cells are identifiable in asymptomatic individuals, a minority of whom will progress to symptomatic CLL. While some people remain at a pre-cancerous or asymptomatic early stage, others experience clinically aggressive disease. There is currently no effective method to identify individuals at higher risk of developing symptomatic CLL, and despite recent advances in treatment the disease remains incurable. Better knowledge of early disease and improved classification methods are therefore needed. Here, we propose a molecular classifier based on whole genome sequencing (WGS). The classifier was obtained by genomically characterising pre-malignant and early-stage disease and comparing the resulting genomic subgroups with those of symptomatic CLL.
Methods. Individuals with newly diagnosed MBL or early asymptomatic CLL were enrolled in the OXPLORED clinical trial (Oxford Pre-cancerous Lymphoproliferative Disorders: Analysis and Interception study). We performed WGS on 400 samples from 200 individuals (sorted CD19+ B-cells as tumour and matched salivary DNA as germline) to detect somatic alterations. We derived immunoglobulin heavy chain variable (IGHV) mutational status, stereotype and an additional 186 genomic features as previously described (Robbe*, Ridout* et al. Nature Genetics 2022), including known and recently discovered candidate drivers of CLL, recurrent structural variants, mutational signatures, genomic complexity measures, mutational burden, and pathway alterations. All data were compiled into a matrix, from which meaningful sets of features were extracted to cluster patients' genomes using non-negative matrix factorization (NMF). Genomic findings were compared with the largest CLL whole genome cohort of symptomatic patients requiring frontline treatment (n=443).
Results. IGHV was unmutated in 31% of individuals and hypermutated in 69% (excluding 6 genomes with missing data). Next, we considered mutations and copy number alterations (CNAs) in 58 common CLL drivers and 34 regions with recurrent CNAs. Although numbers were lower than in frontline CLL, most individuals carried at least one driver (86%), and 57% carried more than one (by comparison, 98% of patients in the frontline CLL cohort carried at least one CLL driver). Genomic complexity, defined as 4 or more CNAs, was detected in 12%, and a complex genome, defined as the presence of both copy number gains and losses, in 22%. Newly diagnosed MBL and early asymptomatic CLL therefore show a relatively high degree of genomic complexity, similar to that observed in frontline CLL. The most common driver was del13q (60%), which was the sole driver in a third of these genomes. Other recurrent alterations were IGLL5 mutations (20%), trisomy 12 (9%), and mutations in MYD88 (8%), CREBBP (5%), NOTCH1 (5%), and SPEN (5%). Noncoding elements were also mutated, including 30 promoters (BIRC3 in 7%), five UTRs and eight enhancers, with the BCL6 and PAX5 enhancers mutated in 18% and 3.4% of individuals, respectively. Compared with CLL genomes, alterations in the main cancer pathways were depleted in OXPLORED, including known drivers such as ATM, NOTCH1 and SF3B1. Finally, moving beyond single-alteration grouping, we clustered the 643 genomes (443 CLL + 200 OXPLORED) on the 186 genomic features using NMF. Importantly, we found distinct genomic profiles in the early disease cohort, including one subgroup resembling symptomatic frontline CLL and another, distinct subgroup, which points to a genomic subgroup at higher risk of progressing to CLL.
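The NMF clustering step described above can be illustrated with a minimal sketch. It is not the study's pipeline: the toy matrix, the rank of 2, the Lee-Seung multiplicative updates, the rule of assigning each genome to its highest-weight component, and all function names are assumptions made here for illustration, since the abstract does not specify the NMF implementation or the number of clusters.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Squared Frobenius distance between V and the product W @ H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))

def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorise a non-negative matrix V (samples x features) as W @ H
    using Lee-Seung multiplicative updates for the Frobenius objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialisation (an arbitrary choice).
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

def assign_clusters(w):
    """Label each sample by the component with the largest weight in W."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]

# Toy input: 6 hypothetical genomes x 4 hypothetical binary features
# (1 = alteration present). Real input would be 643 genomes x 186 features.
V = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

W, H = nmf(V, rank=2)
labels = assign_clusters(W)
print(labels)
```

Here the rows of W give each genome's loading on the latent components and the rows of H give each component's feature profile. In practice the rank would typically be chosen by comparing several values, for example on reconstruction error or cluster stability, rather than being fixed in advance.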
Conclusion. Comprehensive characterisation of the genomic events that determine progression to malignancy is essential to understand the mechanisms of disease progression. Our study opens new horizons towards identifying individuals at higher risk and might pinpoint those who need early intervention with curative intent. We therefore propose to analyse newly generated WGS data to establish genomic subgroups in early CLL and pre-malignant stages.