Abstract Objectives: Sarcomas are mesodermal cancers of bone and soft tissue of which there are >60 malignant varieties, many of which can be difficult to diagnose or subtype using traditional histopathology. A universal molecular definition of sarcoma types would therefore be an invaluable tool to the diagnostic pathologist. RNA has the potential to offer a complementary perspective to cytogenetic- and methylation-based diagnostics, as it represents the active state of the disease at sampling and better reveals its phenotype. Recognizing the potential for RNA-based classification, we set out to create a first-generation transcriptional atlas of sarcoma. Methods: To develop transcriptional definitions of cancers with the potential to further subclassify tumor types, we designed a self-optimizing and scale-adaptive unsupervised method (RACCOON), which groups samples into hierarchically organized clusters. We applied this approach to the UCSC Treehouse Childhood Cancer Compendium, a set of 2,178 pediatric and 9,400 adult tumors, 1,130 of which are sarcomas, as well as 1,735 non-neoplastic samples. We then trained an ensemble of convolutional neural networks to assign tumors to these transcriptional clusters. We have now added 624 more sarcoma samples from Toronto centers and international collaborators to better represent the breadth of sarcoma. We are actively sequencing 500 additional samples in partnership with the Gabriella Miller Kids First Research Program to yield an expanded cohort of >2,200 uniformly processed and analyzed sarcomas. Results: Sarcomas organize into two clusters at the highest hierarchical level: one characterized by entities that occur primarily in adults and resemble mature tissue, the other by primarily pediatric entities that exhibit high stemness and resemble embryonic tissue.
Several included entities are not bona fide sarcomas but originate from the mesoderm (e.g., Wilms Tumor), signifying a common transcriptional identity for mesodermal neoplasms. Additionally, we demonstrate the first transcriptional subtypes of central osteosarcoma, reflecting its major histotypes and representing divergent clinical courses. We also determine Ewing Sarcoma (ES) to be a distinct entity that clusters separately from all other cancers, raising questions of its origin and affinity to sarcoma. When assigning tumors from ongoing patients to the atlas, we correctly classified >85% and corrected the diagnosis of 7%. We find that 14% of ES cases in our dataset were likely misdiagnosed CIC- or BCOR-driven sarcomas. Critically, assigned subtypes are consistent between primary and relapse pairs. Conclusion: RNA-seq is a promising tool both for subtype discovery and for classifying sarcoma in ongoing patients. We have already included this tool in tumor boards to help inform patient care. Our method reveals the overarching organization of sarcoma for the first time and specifies its underlying biology. This atlas is ever-growing and is open to contributions from the community. Citation Format: Joshua O. Nash, Federico Comitani, Rose Chami, Sarah Cohen-Gogo, Astra Chang-Schwertschkow, Yael Babichev, Jodi Lees, Noa Alon, Nalan Gokgoz, Stephen Man Yu, Kyoko Yuki, Miranda Lorenti, Zhanqin Liu, Alaina McGoey, Famida Spatare, Bernarld Castro, Kim Tsoi, Hagit Peretz Soroka, Jack Brzezinski, Anita Villani, Albiruni Razak, Abha Gupta, Elizabeth Demicco, Gino Somers, Brendan C. Dickson, Jay S. Wunder, Irene L. Andrulis, David Malkin, Rebecca A. Gladdy, Adam Shlien. The development of a multiscale transcriptional atlas of sarcoma [abstract]. In: Proceedings of the AACR Special Conference: Sarcomas; 2022 May 9-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2022;28(18_Suppl):Abstract nr B027.