52

Background: Identifying the cancer signal of origin is of great value in the clinical use of blood-based cancer testing, especially for tests that have the potential to assess for more than one cancer type. An approach that accurately identifies the cancer signal of origin, and so guides the subsequent diagnostic workup, is therefore important for the adoption of this new technology. Here, we report feasibility data using the DNA methylation signature of plasma cell-free DNA to differentiate between colorectal cancer (CRC) and gastric/esophageal cancer (GEC).

Methods: We developed an algorithm to distinguish between CRC and GEC based on the DNA methylation signature of plasma cell-free DNA molecules from more than 3,000 differentially methylated regions. First, a regression model (a multi-cancer classifier) was trained to identify samples with sufficient tumor-derived methylation signal, using 6,822 cancer samples of 24 cancer types and samples from 4,423 cancer-free individuals. A second regression model (a CRC/GEC classifier) was then trained to distinguish CRC from GEC cases among the samples with positive calls from the multi-cancer classifier. The second model was trained on methylation signals from advanced-stage CRC samples, advanced-stage GEC samples (including esophageal, gastroesophageal, and gastric cancers), and samples from cancer-free individuals.

Results: The multi-cancer classifier identified 92% (2,954/3,204) of CRC and GEC samples as positive at 90% target specificity, including 93% (2,086/2,253) of CRC samples and 91% (868/951) of GEC samples. The performance of the CRC/GEC classifier was evaluated through 10-fold cross-validation on these 3,204 cancer samples; in each fold, an additional 6,298 samples from cancer-free individuals were also used in training. Among the 2,954 CRC and GEC samples detected by the multi-cancer classifier, 2,698 (91%) were accurately classified by the CRC/GEC classifier. Among the detected CRC samples, 91% (1,907/2,086) were correctly classified as CRC. Similarly, 91% (791/868) of the detected GEC samples were correctly classified as GEC. The precision of the CRC/GEC predictions was also assessed. Of the 1,984 detected cancer samples predicted as CRC, 96% (1,907/1,984) were true CRC samples. Of the 970 detected cancer samples predicted as GEC, 82% (791/970) were true GEC samples.

Conclusions: With high sensitivity and accurate CRC/GEC classification, this assay can effectively guide subsequent diagnostic workups following a positive result, maximizing potential clinical utility. Further refinement is ongoing.
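
The classification figures in the Results follow from a 2x2 confusion matrix over the 2,954 detected samples. The sketch below is illustrative arithmetic only, not the authors' analysis code: the counts are taken from the abstract, while the variable and function names (`detected`, `correct`, `recall`, `precision`) are assumptions introduced here.

```python
# Counts copied from the Results; all names below are illustrative.
detected = {"CRC": 2086, "GEC": 868}  # called positive by the multi-cancer classifier
correct = {"CRC": 1907, "GEC": 791}   # of those, labeled correctly by the CRC/GEC classifier

# Confusion matrix: outer key is the true type, inner key is the predicted type.
confusion = {
    "CRC": {"CRC": correct["CRC"], "GEC": detected["CRC"] - correct["CRC"]},
    "GEC": {"CRC": detected["GEC"] - correct["GEC"], "GEC": correct["GEC"]},
}

# Column sums: how many detected samples were predicted as each type.
predicted_totals = {
    label: sum(confusion[true][label] for true in confusion) for label in confusion
}


def recall(label: str) -> float:
    """Fraction of detected samples of a true type that were labeled as that type."""
    return confusion[label][label] / detected[label]


def precision(label: str) -> float:
    """Fraction of samples predicted as a type that truly are that type."""
    return confusion[label][label] / predicted_totals[label]


# Overall agreement across both types.
accuracy = sum(correct.values()) / sum(detected.values())

if __name__ == "__main__":
    print(f"overall accuracy: {accuracy:.1%}")
    for label in confusion:
        print(
            f"{label}: predicted={predicted_totals[label]}, "
            f"recall={recall(label):.1%}, precision={precision(label):.1%}"
        )
```

The column sums recover the 1,984 predicted-CRC and 970 predicted-GEC samples quoted above. The lower GEC precision reflects the class imbalance: the 179 misclassified CRC samples are a small share of all CRC cases but a sizable share of the GEC predictions.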