Abstract

Background: Recent literature proposes several diagnostic machine learning (ML) models for ovarian cancer based on miRNA expression profiling. These models are trained and validated on subjects whose sample for miRNA profiling was drawn close to the time of cancer diagnosis (e.g., within one month). Whether they remain useful remote from the time of diagnosis, when early detection or prevention would be most relevant, is unclear. We therefore examine the effect of the time between blood draw and cancer diagnosis on miRNA-based ML model accuracy, and the relevance of a cancer probability score, Pc, for estimating long-term cancer risk.

Methods: The study is based on the miRNA expression profiles of 2983 subjects: 1829 from the Biobank at Mass General Brigham (MGB), 110 from the Pelvic Mass Protocol at Brigham and Women's Hospital (Cramer), and 1044 from the Prostate, Lung, Colorectal, and Ovarian (PLCO) cancer screening trial. miRNA expression was measured with the Fireplex® circulating miRNA assay, using a pre-specified panel of 179 miRNAs optimized for serum detection. We trained ML models on 1865 controls and 74 ovarian cancer subjects (1829 Biobank + 55 Cramer + 55 PLCO). The training cancer samples were drawn between 1 day and 14 years from diagnosis, most within 30 days. The models were then validated on 769 control and 275 ovarian cancer subjects (989 PLCO + 55 Cramer) whose time to cancer diagnosis ranged from 1 to 1814 days (up to 5 years) after blood draw. Performance was reported as area under the receiver operating characteristic curve (AUC).

Results: Among the validation set cases, predicted cancer probability (Pc) decreased with increasing log time to diagnosis (R = -0.34, p < 0.0001), and AUC showed a similar trend with time.
On samples drawn within 21 days of cancer diagnosis, the ML model achieves AUC = 0.88; this falls to AUC = 0.72 on samples drawn between 21 days and one year from diagnosis and then plateaus at AUC ~ 0.72 up to five years from diagnosis. Of the 2983 subjects in the study, 286 had multiple blood draws over a 5-year period. Using these data, we analyze the change in Pc over time per subject. For the average case subject, Pc increased by 7% per year (p = 0.02). In contrast, the same analysis applied to control subjects showed Pc increasing by only 1% per year on average (p = 0.62). Thus, monitoring changes in Pc at regular intervals could be diagnostically informative. In terms of relative risk, subjects with Pc < 0.5 had a relative 5-year cancer risk of 0.74, whereas subjects with Pc > 0.5 had a relative risk of 7.4 (i.e., an order of magnitude higher).

Conclusion: The results indicate that miRNA-based ML models can be used to identify individuals at increased long-term risk of ovarian cancer and provide a tool for profiling at regular intervals to allow earlier diagnosis of disease.

Citation Format: James Webber, Laura Wollborn, Sudhanshu Mishra, Stephanie Alimena, Bryanna Testino, Allison Vitonis, Daniel Cramer, Dipanjan Chowdhury, Kevin Elias. Time to diagnosis analysis using miRNA-based ovarian cancer prediction models [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 3896.
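
The cohort sizes given in the Methods can be cross-checked with simple arithmetic. The sketch below is only a bookkeeping aid and not part of the authors' analysis: the variable names are illustrative, the counts are copied from the Methods, and the 365.25 days-per-year conversion is an assumption made here.

```python
# Subject counts as stated in the Methods section of the abstract.
cohorts = {"MGB Biobank": 1829, "Cramer": 110, "PLCO": 1044}
training = {"controls": 1865, "cases": 74}
validation = {"controls": 769, "cases": 275}

total = sum(cohorts.values())
n_train = sum(training.values())
n_valid = sum(validation.values())

# Training draws on all Biobank subjects plus 55 Cramer and 55 PLCO subjects.
assert n_train == 1829 + 55 + 55
# Validation uses the remaining 989 PLCO and 55 Cramer subjects.
assert n_valid == 989 + 55
# Training and validation sets together account for every subject.
assert n_train + n_valid == total

# Longest draw-to-diagnosis interval in the validation set, in years
# (365.25 days per year is an assumption made here for the conversion).
max_years = 1814 / 365.25

print(total, n_train, n_valid, round(max_years, 2))
```

Under these stated counts, the training and validation sets partition the full study population, and the longest validation interval of 1814 days falls just under five years.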