Abstract

Multiple expression signatures for the prediction of the site of origin of metastatic cancers of unknown primary origin (CUP) have been developed. Owing to their limited coverage of tumor types and suboptimal prediction accuracy on distinct tumors, there is still room for alternative CUP gene expression signatures. Whereas in past studies CUP classifiers were trained solely on data from tumor samples, we now use expression patterns from normal tissues for classifier training. This approach potentially avoids pitfalls related to the representation of genetically heterogeneous tumor subtypes during classifier training. Two expression data sets of normal human tissues were reanalysed to derive an expression signature for liver, prostate, kidney, ovarian and lung tissues. In reciprocal validation, classifiers trained on either data set achieved overall accuracies greater than 97%. Classifiers trained on combined expression data from both normal tissue data sets were able to predict the site of origin in a cohort of 652 primary tumors with approximately 90% accuracy. Prediction accuracies of primary cancer-based classifiers, as determined by cross-validation on this cohort, were in the same range. For individual tumor types, normal tissue-based best-centroid classifiers achieved sensitivities ranging from 71 to 99% and specificities ranging from 91 to 99%. Primary origins were predicted correctly for 12 of 20 metastases; the false predictions highlight the need for accurate sample preparation to avoid contamination by tissue surrounding the metastases. We conclude that gene expression patterns of normal tissues harbor phenotypic information that is retained in tumors and can be sufficient to recover the type of a primary tumor from expression patterns alone. Oncogene advance online publication, 16 November 2009; doi:10.1038/onc.2009.398.
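
The general idea of training a centroid classifier on normal tissue and applying it to tumor samples can be illustrated with a minimal sketch. This is not the authors' implementation: the three-gene expression vectors, the squared Euclidean distance, and all function names below are assumptions chosen for illustration, and the numbers are invented toy values rather than data from the study.

```python
from collections import defaultdict


def train_centroids(samples, labels):
    """Average the expression vectors of each tissue class into one centroid."""
    grouped = defaultdict(list)
    for vector, label in zip(samples, labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Return the tissue whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((c - v) ** 2 for c, v in zip(centroid, vector))
    return min(centroids, key=lambda label: distance(centroids[label]))


def sensitivity_specificity(true_labels, predicted_labels, target):
    """One-vs-rest sensitivity and specificity for a single tissue type.

    Assumes the target occurs at least once among the true labels and that
    at least one sample has a different true label (no zero division guard).
    """
    tp = fn = fp = tn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == target:
            if guess == target:
                tp += 1
            else:
                fn += 1
        elif guess == target:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: rows are samples, columns are three marker genes.
normal_samples = [
    [9.0, 1.0, 1.0], [8.0, 2.0, 1.0],   # liver
    [1.0, 9.0, 2.0], [2.0, 8.0, 1.0],   # lung
    [1.0, 2.0, 9.0], [2.0, 1.0, 8.0],   # kidney
]
normal_labels = ["liver", "liver", "lung", "lung", "kidney", "kidney"]

# Toy "tumor" profiles with their true primary site; the last one is ambiguous.
tumor_samples = [[7.0, 3.0, 2.0], [2.0, 6.0, 3.0], [3.0, 2.0, 6.0], [4.0, 5.0, 2.0]]
tumor_labels = ["liver", "lung", "kidney", "liver"]

centroids = train_centroids(normal_samples, normal_labels)
predictions = [predict(centroids, sample) for sample in tumor_samples]
print(predictions)  # ['liver', 'lung', 'kidney', 'lung']
print(sensitivity_specificity(tumor_labels, predictions, "liver"))  # (0.5, 1.0)
```

In this toy run the classifier never sees a tumor profile during training, and the one misassigned sample lowers the liver sensitivity while leaving its specificity untouched, which is the kind of per-tissue trade-off the reported sensitivity and specificity ranges summarize.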

