Background: The advantages of genome-wide sequencing approaches over conventional methods in CLL diagnostics are a matter of debate, and exact guidelines for the application of next-generation sequencing in a diagnostic context are currently missing. Aim: To compare the accuracy of whole genome sequencing (WGS) and whole transcriptome sequencing (WTS) with that of conventional procedures in determining the presence of point mutations, IGHV mutational status (IGHVms), and chromosomal aberrations in a clinical setting. Patients and Methods: The cohort comprised 317 CLL patients. Diagnosis was established following WHO guidelines. WGS (100x, 2x151bp) and WTS (5x10^7 reads, 2x101bp) data were generated on NovaSeq instruments. For routine diagnostics, sequencing was carried out by Sanger or targeted panel sequencing (1500x, 2x101bp). Chromosome banding analysis (CBA) and FISH were available for 231 patients. WGS data were analyzed for chromosomal aberrations and copy number alterations using Manta and GATK. In routine diagnostics, IGHVms was determined by Sanger sequencing and/or fragment analysis for 248 patients. For sequencing data, IGHVms was analyzed using IgCaller (WGS) and MiXCR (WTS). Results: Comparing copy number variants (CNV) and balanced structural variants (SV) between CBA and WGS revealed a high concordance. In detail, 297 CNV were detected by both techniques, whereas 32 were detected only by CBA (small subclones) and 52 only by WGS. Thirty-three balanced SV were detected by both techniques, whereas 8 were detected only by CBA and 3 only by WGS. A complex karyotype (≥ 3 abnormalities) was observed by both techniques in 55 cases, while 5 and 10 cases were categorized as complex by CBA only or WGS only, respectively. Survival analysis of cases with 3 to 5 chromosomal aberrations identified by routine diagnostics or WGS revealed similar hazard ratios that were positively correlated with the number of aberrations.
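The overlap counts reported for CBA and WGS can be condensed into simple agreement fractions. The following is a minimal sketch, not the analysis code used in the study: the names `OverlapCounts` and `overlap_summary`, and the choice of summary measures, are assumptions made for illustration, since the abstract does not state how concordance was quantified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverlapCounts:
    """Events found by both techniques, by CBA only, and by WGS only."""
    both: int
    cba_only: int
    wgs_only: int


def overlap_summary(counts: OverlapCounts) -> dict:
    """Summarize agreement between two techniques (hypothetical helper).

    shared_fraction: share of all detected events found by both techniques.
    cba_found_by_wgs: share of CBA-detected events also found by WGS.
    wgs_found_by_cba: share of WGS-detected events also found by CBA.
    """
    total = counts.both + counts.cba_only + counts.wgs_only
    return {
        "total": total,
        "shared_fraction": counts.both / total,
        "cba_found_by_wgs": counts.both / (counts.both + counts.cba_only),
        "wgs_found_by_cba": counts.both / (counts.both + counts.wgs_only),
    }


# Counts as reported in the abstract.
REPORTED = {
    "CNV": OverlapCounts(both=297, cba_only=32, wgs_only=52),
    "balanced SV": OverlapCounts(both=33, cba_only=8, wgs_only=3),
    "complex karyotype": OverlapCounts(both=55, cba_only=5, wgs_only=10),
}

if __name__ == "__main__":
    for name, counts in REPORTED.items():
        s = overlap_summary(counts)
        print(
            f"{name}: n={s['total']}, "
            f"shared={s['shared_fraction']:.1%}, "
            f"CBA found by WGS={s['cba_found_by_wgs']:.1%}, "
            f"WGS found by CBA={s['wgs_found_by_cba']:.1%}"
        )
```

These fractions describe overlap only; they do not indicate which technique is correct for the discordant events.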
To compare SNV calls from routine diagnostics and WGS, we investigated genes with validated mutations in at least ten patients in routine diagnostics. This criterion was met by ATM, MYD88, NOTCH1, SF3B1, and TP53. We observed high concordance between routine and WGS data (ATM 99.4%, MYD88 100%, NOTCH1 97.8%, SF3B1 100%, TP53 99.1%). Five mutations were detected in routine diagnostics only and three in WGS only (routine only/WGS only: ATM 1/0, MYD88 0/0, NOTCH1 2/2, SF3B1 0/0, TP53 2/1). These discrepancies were attributed to low variant allele frequency (5 miscalls, VAF < 5%) and/or filtering out of low-quality variants (5 cases). IGHV rearrangements analyzed by Sanger (248 patients), WGS (242/248 patients), and WTS (236/248 patients) data were compared for matches in the rearranged IGHV. IGHVms calling used 98% sequence identity as the cutoff. Calling the rearranged IGHV was concordant in 93%, 95%, and 92% of cases when comparing Sanger/MiXCR, Sanger/IgCaller, and MiXCR/IgCaller calls, respectively. For IGHVms, the corresponding percentages of concordance were 91%, 86%, and 83%. Mismatches in calling the IGHVms were found in borderline cases. Detection of stereotyped IGHV was fully concordant. In addition to reliably replicating routine results, we find that WGS/WTS data can also outperform routine procedures. In our hands, the highest hazard ratio (4.94) in survival analysis distinguishing mutated CLL (mCLL) from unmutated CLL (uCLL) was observed using WTS for IGHVms determination and 97% sequence identity as the cutoff (208 patients, up to 30 years of follow-up). Survival for borderline cases (97.00-97.99% sequence identity) was statistically indistinguishable from uCLL in routine diagnostics as well as WGS/WTS. Furthermore, using WGS, CN-LOH regions and deletions below the detection limit of CBA (10 Mb) can be detected.
We observed such regions in samples with mutated TP53 (5 CN-LOH) and ATM (2 CN-LOH, 4 deletions < 10 Mb), as well as deletions of the minimally overlapping region chr8:21,700,000-23,600,000, whose loss was associated with poor survival (3/29 deletions < 3.1 Mb). Conclusions: WGS/WTS data reliably replicate results obtained by standard procedures in a diagnostic setting. Sequencing offers advantages including a unified workflow, better sensitivity for small deletions, CN-LOH detection, and genome-wide analysis of coding and non-coding mutations. Challenges of short-read-based sequencing include the detection limit for point mutations and the specificity and sensitivity of chromosomal aberration detection.
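The IGHVms calls compared in the Results depend on a sequence-identity cutoff, and moving it from 98% to 97% reassigns the borderline cases. The sketch below illustrates that effect under stated assumptions: the function names `classify_ighv` and `is_borderline` are hypothetical, and the direction of the rule (identity at or above the cutoff is called unmutated) follows the common convention rather than anything spelled out in the abstract. It is not the logic implemented by IgCaller or MiXCR.

```python
def classify_ighv(identity_pct: float, cutoff: float = 98.0) -> str:
    """Call IGHV mutational status from percent identity to germline.

    Assumed convention: identity >= cutoff is unmutated (uCLL),
    identity < cutoff is mutated (mCLL).
    """
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError(f"identity must be within 0-100, got {identity_pct}")
    return "uCLL" if identity_pct >= cutoff else "mCLL"


def is_borderline(identity_pct: float) -> bool:
    """True for the borderline range named in the abstract (97.00-97.99%)."""
    return 97.0 <= identity_pct < 98.0


if __name__ == "__main__":
    # Illustrative identity values only, not patient data.
    for identity in (96.5, 97.4, 98.0, 100.0):
        print(
            f"{identity:6.2f}%  "
            f"cutoff 98: {classify_ighv(identity, 98.0)}  "
            f"cutoff 97: {classify_ighv(identity, 97.0)}  "
            f"borderline: {is_borderline(identity)}"
        )
```

With the 97% cutoff, a borderline value such as 97.4% is called uCLL instead of mCLL, which is consistent with the reported observation that borderline cases had survival indistinguishable from uCLL.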