Small blue round cell sarcomas (SBRCSs) are a heterogeneous group of tumors with overlapping morphologic features but markedly different prognoses. They are characterized by distinct chromosomal alterations, particularly rearrangements leading to gene fusions, whose detection currently represents the most reliable diagnostic marker. Ewing sarcomas (ESs) are the most common SBRCSs, defined by gene fusions involving EWSR1 and transcription factors of the ETS gene family, while the most frequent non-EWSR1-rearranged SBRCSs harbor a CIC rearrangement. Unfortunately, the identification of CIC::DUX4 translocation events, the most common CIC rearrangement, is challenging with current methods. Here, a machine-learning approach to support SBRCS diagnosis, relying on gene expression profiles measured via targeted sequencing, is presented. Analyses of a curated cohort of 69 soft tissue tumors (STTs) showed markedly distinct expression patterns for SBRCS subgroups. A random forest (RF) classifier trained on ES and CIC-rearranged cases predicted probabilities of being CIC-rearranged > 0.9 for CIC-rearranged-like sarcomas and < 0.6 for other SBRCSs. Testing on a retrospective cohort of 1335 routine diagnostic cases identified 15 candidate CIC-rearranged tumors with a probability > 0.75, all of which were supported by expert histopathological reassessment. Furthermore, the multi-gene RF classifier appeared advantageous over using high ETV4 expression alone, which was previously proposed as a surrogate marker for CIC rearrangement. Taken together, the expression-based classifier can offer valuable support for the pathological diagnosis of SBRCSs.
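
The screening step described above, flagging routine cases whose predicted probability of being CIC-rearranged exceeds 0.75, can be illustrated with a minimal sketch. Only the 0.75 cut-off comes from the abstract; the function name `flag_candidates`, the case identifiers, and the probability values are hypothetical and invented for illustration, and the sketch assumes the classifier probabilities have already been computed elsewhere.

```python
from typing import Dict, List, Tuple

# Cut-off quoted in the abstract for candidate CIC-rearranged tumors.
CANDIDATE_THRESHOLD = 0.75


def flag_candidates(
    probabilities: Dict[str, float],
    threshold: float = CANDIDATE_THRESHOLD,
) -> List[Tuple[str, float]]:
    """Return (case_id, probability) pairs strictly above the threshold, highest first."""
    flagged = [(case_id, p) for case_id, p in probabilities.items() if p > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical classifier outputs; toy values only, not data from the study.
toy_probabilities = {
    "case_A": 0.93,
    "case_B": 0.41,
    "case_C": 0.78,
    "case_D": 0.75,  # exactly at the cut-off, so not flagged (strict ">")
}

candidates = flag_candidates(toy_probabilities)
print(candidates)
```

In such a workflow the flagged cases would be candidates for expert reassessment rather than final diagnoses, in line with the histopathological review reported for the 15 candidates.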