Abstract

Tumor tissue of origin (TOO) is an important factor for guiding treatment decisions. However, TOO cannot be determined for ~3% of metastatic cancer patients who present in the clinic; these cases are categorized as cancers of unknown primary (CUP). As whole genome sequencing (WGS) of tumors is now transitioning from the research domain to diagnostic practice to meet the increasing demand for biomarker detection, its use for TOO detection in routine diagnostics is also coming within reach. While proof of concept for the use of genome-wide features has been demonstrated before, more complex WGS mutation features, including structural variant (SV) driver and passenger events, have never been integrated into TOO classifiers, even though they bear highly characteristic links with TOO. Using a uniformly processed dataset containing 6820 whole-genome sequenced primary and metastatic tumors (ICGC/PCAWG and Hartwig cohorts), we have developed Cancer of Unknown Primary Location Resolver (CUPLR), a random forest-based TOO classifier that employs 502 genome-wide features based on simple and complex somatic driver and passenger mutations. Our model is able to distinguish 33 cancer (sub)types with an overall accuracy of 91% and 89% based on cross-validation (n=6139) and hold-out set (n=681) predictions, respectively. We found that SV-derived features increase the accuracy and utility of TOO classification for specific cancer types. To ensure that predictions are human-interpretable and suited for use in routine diagnostics, CUPLR reports the top contributing features and their values compared to cohort averages. The comprehensive output of CUPLR is complementary to existing histopathological procedures and may thus improve diagnostics for patients with CUP. CUP is the first reimbursed indication for WGS in The Netherlands.
We will report on the prospective use and clinical utility of WGS for CUP patients in routine diagnostics, and highlight the added value of CUPLR for unsolved differential diagnosis cases and for incidental cases in which the independent WGS analysis was discrepant with the original tumor diagnosis.

Citation Format: Luan Nguyen, Luuk Schipper, Paul Roepman, Kim Monkhorst, Arne van Hoeck, Petur Snaebjornsson, Edwin Cuppen. Machine learning-based tissue of origin classification for cancer of unknown primary diagnostics using genome-wide mutation features [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 2167.
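The cohort sizes and accuracies quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is not part of CUPLR; the function names are hypothetical, and the pooled accuracy is only an illustrative sample-size-weighted average that the abstract itself does not report. Because the published accuracies are rounded to whole percentages, the derived counts are approximate.

```python
# Figures quoted in the abstract.
TOTAL_TUMORS = 6820
CV_N, CV_ACC = 6139, 0.91            # cross-validation set
HOLDOUT_N, HOLDOUT_ACC = 681, 0.89   # hold-out set


def holdout_fraction(cv_n, holdout_n):
    """Share of all samples that was reserved for the hold-out set."""
    return holdout_n / (cv_n + holdout_n)


def approx_correct(n, accuracy):
    """Approximate number of correct predictions implied by a rounded accuracy."""
    return round(n * accuracy)


def pooled_accuracy(parts):
    """Sample-size-weighted accuracy over (n, accuracy) pairs.

    Assumption: this is an illustrative summary only, not a metric
    reported by the authors.
    """
    total = sum(n for n, _ in parts)
    return sum(n * acc for n, acc in parts) / total


if __name__ == "__main__":
    # The two subsets should add up to the full dataset.
    print("subsets sum to total:", CV_N + HOLDOUT_N == TOTAL_TUMORS)
    print("hold-out fraction:", round(holdout_fraction(CV_N, HOLDOUT_N), 4))
    print("approx. correct (CV):", approx_correct(CV_N, CV_ACC))
    print("approx. correct (hold-out):", approx_correct(HOLDOUT_N, HOLDOUT_ACC))
    pooled = pooled_accuracy([(CV_N, CV_ACC), (HOLDOUT_N, HOLDOUT_ACC)])
    print("pooled accuracy:", round(pooled, 3))
```

The check confirms that the cross-validation and hold-out sets together account for all 6820 tumors, with roughly one tenth of the samples held out.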

