Abstract
Using machine learning, we developed a molecular assay from publicly available tumor tissue mRNA databases and present preliminary evidence that it functions as a whole-blood diagnostic and monitoring tool for prostate cancer (PCa). We assessed 1055 PCas (public microarray data sets) to identify putative mRNA biomarkers. Specificity was confirmed against 32 different solid and hematological cancers from The Cancer Genome Atlas (n = 10,990). This defined a 27-gene panel, which was validated by qPCR in 50 histologically confirmed PCa surgical specimens and matched blood samples. An ensemble classifier (Random Forest, Support Vector Machines, XGBoost) was trained on age-matched PCas (n = 294), 72 controls, and 64 benign prostatic hyperplasia (BPH) cases. Classifier performance was validated in two independent sets (n = 263 PCas; n = 99 controls). We assessed the panel as a postoperative disease monitor in a radical prostatectomy cohort (RPC: n = 47). A PCa-specific 27-gene panel was identified. Matched blood and tumor gene expression levels were concordant (r = 0.72, p < 0.0001). The ensemble classifier output ("PROSTest") was scaled from 0% to 100%, and a score of ≥50%, the industry-standard operating point, was used to define PCa. At this cutoff, PROSTest exhibited 85% sensitivity and 95% specificity for PCa versus controls. In the two independent validation sets, sensitivity was 92%-95% and specificity was 100%. In the RPC (n = 47), PROSTest scores decreased from 72% ± 7% to 33% ± 16% (p < 0.0001, Mann-Whitney test). The PROSTest score was 26% ± 8% in the 37 patients with normal postoperative PSA levels (<0.1 ng/mL) and 60% ± 4% in the 10 patients with elevated postoperative PSA. A 27-gene whole blood signature for PCa is concordant with tissue mRNA levels. Measuring blood expression provides a minimally invasive genomic tool that may facilitate prostate cancer management.
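To make the scoring and cutoff arithmetic concrete, the following is a minimal sketch of how a 0%-100% ensemble score and a ≥50% operating point translate into sensitivity and specificity. It is not the published PROSTest implementation: the abstract does not say how the three models are combined, so simple averaging of per-model probabilities is an assumption here, and the function names and the sample probabilities are hypothetical and purely illustrative.

```python
from statistics import mean


def ensemble_score(model_probabilities):
    """Combine per-model PCa probabilities (each 0..1) into a 0-100% score.

    ASSUMPTION: plain averaging. The abstract does not state how the
    Random Forest, SVM, and XGBoost outputs are combined.
    """
    return 100.0 * mean(model_probabilities)


def is_positive(score, cutoff=50.0):
    """Call a sample PCa-positive when its score is at or above the cutoff."""
    return score >= cutoff


def sensitivity_specificity(scores, labels, cutoff=50.0):
    """Return (sensitivity, specificity) for scores against true labels."""
    tp = fn = tn = fp = 0
    for score, has_pca in zip(scores, labels):
        called = is_positive(score, cutoff)
        if has_pca and called:
            tp += 1
        elif has_pca:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up probabilities from three hypothetical models; NOT study data.
sample_probabilities = [
    (0.90, 0.80, 0.70),  # PCa
    (0.60, 0.40, 0.65),  # PCa
    (0.30, 0.50, 0.40),  # PCa, scored below the cutoff
    (0.10, 0.20, 0.15),  # control
    (0.45, 0.30, 0.20),  # control
]
labels = [True, True, True, False, False]

scores = [ensemble_score(p) for p in sample_probabilities]
sensitivity, specificity = sensitivity_specificity(scores, labels)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Sensitivity is the fraction of true PCa samples scoring at or above the cutoff, and specificity is the fraction of controls scoring below it, so moving the cutoff trades one against the other.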