Abstract

Accurate identification of the primary origin of metastatic tumors is essential for optimizing treatment and requires a pathologist to integrate multiple forms of data while examining tissue. However, despite the use of highly sensitive and specific immunohistochemical stains for some cell lineages, pathologists cannot reliably determine the origin of every metastatic tumor, and 1-2% are classified as cancers of unknown primary (CUP) even after other clinical data are integrated [1]. Previous work has shown that artificial intelligence algorithms can be used to predict primary origin from histology [2] or from different forms of molecular data, including genomics [3], transcriptomics [4], or methylation profiles [5]. We present a multimodal deep learning algorithm that leverages routinely acquired histology slides, associated clinically available genomics data, and patient sex to classify tumors into 18 different primary origins. Our approach shows substantial improvement over unimodal deep learning using histology or genomic data alone, achieving accuracies of 88.1% and 92.0% on a held-out test set (n=4,881) and an external test set (n=660), respectively. Furthermore, on CUP cases (n=283), we observed an agreement of 85.5% between the model’s three most likely predicted origins and the differential diagnoses assigned in the associated pathology reports. At test time, our flexible model design enables origin prediction from histology alone or from genomics alone when the other modality is missing. Additionally, our model supports interpretability studies that show which regions of the histology slide and which genes contribute most to the prediction of a particular origin, a potentially useful tool for quality control and knowledge discovery.
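
The top-three agreement figure reported for CUP cases can be illustrated with a minimal sketch. The abstract does not give the exact definition, so this assumes a case counts as agreeing when at least one of the model's k highest-probability origins appears among the differential diagnoses in the pathology report. The function name `top_k_agreement`, the toy probabilities, and the five-class setup are all hypothetical and are not taken from the paper.

```python
def top_k_agreement(probs, differentials, k=3):
    """Fraction of cases whose k most probable classes overlap the
    case's differential diagnoses (assumed definition of agreement)."""
    hits = 0
    for row, diff in zip(probs, differentials):
        # Rank class indices from most to least probable.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        # A case agrees if any top-k class is in its differential set.
        if set(ranked[:k]) & set(diff):
            hits += 1
    return hits / len(probs)


# Toy data: 3 cases over 5 hypothetical origin classes (the paper uses 18).
probs = [
    [0.50, 0.20, 0.15, 0.10, 0.05],
    [0.05, 0.10, 0.15, 0.30, 0.40],
    [0.40, 0.30, 0.20, 0.06, 0.04],
]
# Differential diagnoses per case, as sets of class indices.
differentials = [{1}, {0}, {2, 4}]

agreement = top_k_agreement(probs, differentials, k=3)
print(f"top-3 agreement: {agreement:.3f}")
```

In this toy example the first and third cases agree and the second does not, so the agreement is 2/3. A stricter definition, such as requiring the top-ranked prediction to match, would give a lower number on the same data.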
