Abstract

3051

Background: Orphan non-coding RNAs (oncRNAs) are a novel category of small RNAs (smRNAs) that are present in tumors and largely absent in healthy tissue. We investigated the utility of oncRNAs extracted from serum for early cancer detection across seven cancer types.

Methods: We collected 2,882 serum samples from individuals with known cancers of the bladder (n=152), breast (220), colon and rectum (141), kidney (283), lung (281), pancreas (287), and stomach (280), as well as from donors with no history of cancer (1,238). We used 0.5 mL serum aliquots to generate and sequence smRNA libraries at an average depth of 20 million 50-bp single-end reads. Samples were split into age-, sex-, and smoking-status-matched training (1,232 cancer; 922 control) and validation (412 cancer; 316 control) cohorts. A large catalog of oncRNAs specific to each cancer was created using tumor and adjacent normal samples from The Cancer Genome Atlas (TCGA) smRNA-seq database. Using the TCGA-derived oncRNAs, we trained a machine learning model to predict cancer presence and tissue of origin (TOO) in a 5-fold cross-validation setup on the training cohort. For the validation cohort, we averaged the predictions of the five models trained on the training cohort.

Results: The model ROC-AUC for detecting cancer was 0.95 (95% CI: 0.94–0.95 for the training cohort and 0.94–0.97 for the validation cohort). Sensitivities for detecting cancer at 95% specificity were 0.74 (0.70–0.76) for early-stage (I/II) and 0.80 (0.76–0.84) for late-stage (III/IV) cancers in the training cohort, and 0.77 (0.71–0.81) and 0.81 (0.73–0.87), respectively, in the validation cohort. Sensitivities of detection for each cancer type are shown. For samples with cancer and TOO predictions, the top-1 and top-2 TOO accuracies were 0.76 (0.68–0.84) and 0.83 (0.76–0.90) in the validation cohort.

Conclusions: These results demonstrate that oncRNAs detected in serum can be used for accurate early detection and localization of multiple cancers. [Table: see text]
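As an illustration of two steps described above (averaging the predictions of the five cross-validation models for validation samples, and reporting sensitivity at 95% specificity), the following is a minimal sketch rather than the authors' pipeline. The function names, the rule of setting the threshold from control scores, and the toy numbers are all assumptions made for this example; the abstract does not specify how the threshold was chosen.

```python
from statistics import mean


def average_fold_predictions(fold_scores):
    """Average per-sample scores across models.

    fold_scores: one list of scores per model, all in the same sample order.
    (Hypothetical helper; not taken from the abstract.)
    """
    return [mean(sample_scores) for sample_scores in zip(*fold_scores)]


def threshold_at_specificity(control_scores, specificity=0.95):
    """Return the smallest control score t such that the fraction of
    controls with score <= t is at least `specificity`.

    Samples are called positive only when their score is strictly above t.
    This thresholding rule is an assumption of the sketch.
    """
    ordered = sorted(control_scores)
    n = len(ordered)
    for rank, score in enumerate(ordered, start=1):
        if rank / n >= specificity:
            return score
    return ordered[-1]


def sensitivity_at_threshold(cancer_scores, threshold):
    """Fraction of cancer samples scoring strictly above the threshold."""
    detected = sum(1 for s in cancer_scores if s > threshold)
    return detected / len(cancer_scores)


# Toy data (invented for illustration only).
fold_scores = [[0.0, 1.0, 0.5], [0.5, 0.5, 0.5]]
ensemble = average_fold_predictions(fold_scores)

controls = [0.1, 0.2, 0.3, 0.4]
cancers = [0.2, 0.35, 0.5, 0.9]
t = threshold_at_specificity(controls, specificity=0.75)
sens = sensitivity_at_threshold(cancers, t)
```

In practice the threshold would be fixed on training-cohort controls and then applied unchanged to the validation cohort, so that the reported validation sensitivity is not tuned on validation data; whether the authors did exactly this is not stated in the abstract.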
