Background
B-precursor acute lymphoblastic leukemia (B-ALL) represents a heterogeneous group of hematological malignancies. Previous studies have identified distinctive gene expression profiles for several molecular subtypes of B-ALL with both biological and clinical importance. However, a proportion of B-ALL cases have remained poorly characterized, with unclear underlying genomic abnormalities. We therefore initiated a large-scale international collaborative study to comprehensively reanalyze and delineate the transcriptome landscape of 1,223 B-ALL cases using RNA-seq-driven genomic analyses.

Methods
RNA-seq data of 1,223 patients with B-ALL were collected from Lund University Hospital, the Singapore and Malaysia MaSpore cohort, the Japan Adult Leukemia Study Group (JALSG), the Therapeutically Applicable Research to Generate Effective Treatments (TARGET)/Children's Oncology Group (COG) cohort and the Multicenter Hematology-Oncology Protocols Evaluation System (M-HOPES) by the Shanghai Institute of Hematology (SIH). We performed gene expression-based analyses to identify molecular subtypes and their defining mRNA features. In parallel, we systematically identified all gene fusions and potential driver mutations in each case. We subsequently defined B-ALL subtypes by integrative genomic analysis and comprehensively evaluated their effects on outcomes across B-ALL treatment regimens.

Results
In this comprehensive analysis of the transcriptomic landscape of 1,223 B-ALL cases, several novel molecular subtypes of B-ALL were identified using stringent statistical testing capable of detecting subtle genetic features. In total, fourteen gene expression subgroups (G1-G14) were identified in the integrated B-ALL datasets.
Apart from extending eight previously described subgroups (G1-G8, associated respectively with MEF2D fusions, TCF3-PBX1, ETV6-RUNX1/-like, DUX4 fusions, ZNF384 fusions, BCR-ABL1/Ph-like, high hyperdiploidy and MLL fusions), we additionally defined six transcriptome subgroups: G9, associated with both PAX5 and CRLF2 fusions; G10 and G11, associated respectively with hotspot mutations in PAX5 (p.P80R) and IKZF1 (p.N159Y); G12, associated with IGH-CEBPE fusion and hotspot mutations in ZEB2 (p.H1038R); and G13 and G14, associated respectively with TCF3/4-HLF and NUTM1 fusions. We next analyzed non-silent sequence variants in the available WES and RNA-seq data using in-house analysis criteria inspired by previously published work. We identified 44 genes that were recurrently mutated in at least 1% of the cases (12/1,223 cases). Non-silent variants in NRAS, KRAS, FLT3, KMT2D, PAX5, PTPN11, CREBBP and TP53 exhibited the highest mutation frequencies (3-14%).

Conclusion/Summary
Leukemogenic factors contributing to B-ALL are highly heterogeneous. This large-scale cohort transcriptome sequencing analysis of B-ALL revealed distinct molecular subgroups that reflect discrete paths of B-ALL, informing disease classification and prognostic stratification. Our work thus further reveals the complexity of leukemogenesis in B-ALL and suggests the necessity of multiple therapeutic approaches.

Disclosures
No relevant conflicts of interest to declare.