Abstract
Background: Tumors of unknown origin account for up to 5% of newly diagnosed cancers, and the average survival time is 9 to 12 months from diagnosis. Establishing tumor type and subtype guides standard-of-care treatment under several NCCN targeted therapy guidelines.

Methods: Targeted DNA sequencing of more than 500 cancer-associated genes and exome-capture RNA sequencing were carried out on more than 25,000 fresh-frozen or paraffin-embedded tumor samples, including both primary and metastatic tumors. Mutations, copy number variants, and viral sequences were detected from DNA sequencing, while gene expression and fusion events were determined from RNA sequencing. We aimed to predict cancer type using multiple machine learning models, each trained on an individual data type, and to harmonize predictions across data types.

Results: The transcriptome model predicts more than 60 unique diagnoses covering both solid and hematological cancers with >90% overall accuracy on a held-out test set. Of note, the model can accurately predict 10 subtypes of sarcoma and 6 subtypes of neuroendocrine tumors. Gene expression and splicing were the most informative data types, but a performant DNA-only model was also evaluated for use when only DNA data are available. Finally, we evaluated the model on an unlabeled cohort of poorly differentiated samples with inconclusive diagnoses.

Conclusions: The incorporation of multiple modes of omics data can improve the interpretability and robustness of machine learning models to predict cancer diagnosis.

Citation Format: Jackson Michuda, Benjamin Leibowitz, Shlomit Amar-Farkash, Crystal Bevis, Alessandra Breschi, Joshuah Kapilivsky, Catherine Igartua, Joshua S. Bell, Kyle A. Beauchamp, Kevin White, Martin Stumpe, Nike Beaubier, Timothy Taxter. Multimodal prediction of diagnosis for cancers of unknown primary [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 5423.
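
The abstract does not say how predictions from the per-data-type models are harmonized. As an illustration only, the sketch below shows one common option, weighted averaging of class probabilities (late fusion), which also degrades gracefully when only DNA data are available. The function names, the modality keys ("rna", "dna"), the class labels, and the weights are all hypothetical and are not taken from the authors' method.

```python
from typing import Dict, Optional

Probs = Dict[str, float]  # class label -> predicted probability


def harmonize_predictions(
    per_modality: Dict[str, Optional[Probs]],
    weights: Optional[Dict[str, float]] = None,
) -> Probs:
    """Weighted average of class probabilities over available modalities.

    A modality whose value is None (for example, no RNA data) is skipped.
    This is an assumed fusion rule, not the one used in the abstract.
    """
    available = {m: p for m, p in per_modality.items() if p is not None}
    if not available:
        raise ValueError("no modality produced a prediction")
    if weights is None:
        weights = {m: 1.0 for m in available}  # equal weighting by default

    total = sum(weights[m] for m in available)
    if total <= 0:
        raise ValueError("weights of available modalities must sum to > 0")

    # Union of labels, so a class missing from one model counts as 0 there.
    labels = set().union(*(p.keys() for p in available.values()))
    return {
        label: sum(weights[m] * p.get(label, 0.0) for m, p in available.items())
        / total
        for label in labels
    }


def top_diagnosis(combined: Probs) -> str:
    """Return the label with the highest combined probability."""
    return max(combined, key=combined.get)


if __name__ == "__main__":
    # Hypothetical model outputs for one sample.
    rna = {"sarcoma": 0.7, "neuroendocrine": 0.2, "melanoma": 0.1}
    dna = {"sarcoma": 0.4, "neuroendocrine": 0.5, "melanoma": 0.1}
    w = {"rna": 2.0, "dna": 1.0}  # assumed: expression weighted more heavily

    both = harmonize_predictions({"rna": rna, "dna": dna}, w)
    print(top_diagnosis(both), both)

    dna_only = harmonize_predictions({"rna": None, "dna": dna}, w)
    print(top_diagnosis(dna_only), dna_only)
```

Weighting the expression model more heavily mirrors the abstract's observation that gene expression was among the most informative data types, but the specific weights here are placeholders that would need to be tuned on validation data.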