Abstract

Introduction: Pancreatic ductal adenocarcinoma (PDAC) is an aggressive and highly lethal malignancy. The majority of cases present with locally advanced unresectable or metastatic disease, which precludes surgical resection and makes cure virtually impossible. However, a subset of tumors with more indolent biology may be candidates for aggressive cytoreduction. Further, synchronous resection of oligometastatic disease is being investigated in prospective trials. We sought to develop a prognostic biomarker that could identify patients with low-grade tumors.

Methods: The Pancreatic Adenocarcinoma TCGA dataset was analyzed on the cBioPortal platform, including 173 patients with non-metastatic PDAC, known tumor grade, and available mRNAseq data. Well-differentiated (G1) tumors were compared to moderately (G2) or poorly differentiated (G3) tumors to identify differentially expressed (ΔE) genes. The patients were split into training (n=138) and testing (n=35) sets. Low-importance genes were eliminated using a combination of LASSO regression, recursive feature elimination, random forest (RF), and Boruta algorithms. The synthetic minority oversampling technique (SMOTE) was employed to address class imbalance. Support vector machine (SVM), RF, and k-nearest neighbors (kNN) models were independently trained for the binary classification of G1 vs G2/3 PDAC. The best model and hyperparameters were selected using the mean bootstrapped nested k-fold cross-validation F1-score and evaluated on the held-out test set. Lastly, the model predictions of grade were used to divide patients into two cohorts for

Results: There were 1335 ΔE genes between G1 and G2/3 tumors. A panel of 7 genes was selected unanimously by all four low-importance gene elimination algorithms and used to train the predictive models. In bootstrapped nested k-fold cross-validation of the training dataset, mean sensitivity and specificity were 0.98 and 0.91 for G1, respectively, and 0.91 and 0.97 for G2/3 PDAC.
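
The per-class sensitivity and specificity quoted above, and the F1-score used for model selection, all derive from the same four confusion counts. The following is a minimal sketch of those formulas, not the authors' code; the `Counts` class, the helper names, and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion counts with G1 treated as the positive class."""
    tp: int  # G1 tumors predicted as G1
    fn: int  # G1 tumors predicted as G2/3
    fp: int  # G2/3 tumors predicted as G1
    tn: int  # G2/3 tumors predicted as G2/3


def sensitivity(c: Counts) -> float:
    # Fraction of true positives that the model recovers.
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    # Fraction of true negatives that the model recovers.
    return c.tn / (c.tn + c.fp)


def f1_score(c: Counts) -> float:
    # Harmonic mean of precision and sensitivity: 2TP / (2TP + FP + FN).
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


def swap_positive_class(c: Counts) -> Counts:
    # Re-read the same confusion matrix with G2/3 as the positive class.
    return Counts(tp=c.tn, fn=c.fp, fp=c.fn, tn=c.tp)


# Hypothetical counts, NOT taken from the abstract.
g1_view = Counts(tp=9, fn=1, fp=3, tn=22)
g23_view = swap_positive_class(g1_view)

for name, view in (("G1", g1_view), ("G2/3", g23_view)):
    print(name, round(sensitivity(view), 3), round(specificity(view), 3),
          round(f1_score(view), 3))
```

Within a single confusion matrix, the sensitivity for one class equals the specificity for the other, so the two per-class pairs are mirror images of each other.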
The kNN model performed best, with a mean F1-score of 0.942 (95% CI: [0.941, 0.943]) on the training set and 0.781 (95% CI: [0.776, 0.786]) on the test set.

Conclusion: This study presents a novel machine learning-derived, transcriptome-based tool that predicts PDAC grade with good accuracy and has the potential to identify less aggressive variants of PDAC that may be more amenable to aggressive cytoreductive approaches. Further testing and validation of the algorithm in larger datasets, and on prospectively collected needle-biopsy material, is a worthwhile next step toward refining it into a potent clinical risk assessment tool.

Citation Format: Daniel Fu, Constantinos P. Zambirinis. A machine learning-derived transcriptomic-based biomarker identifies low grade pancreatic ductal adenocarcinoma tumors and may help in treatment decision making [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Pancreatic Cancer; 2023 Sep 27-30; Boston, Massachusetts. Philadelphia (PA): AACR; Cancer Res 2024;84(2 Suppl):Abstract nr A116.
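
The class-rebalancing step named in the Methods (SMOTE) creates synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority-class neighbors. The sketch below illustrates that idea using only the Python standard library. It is neither the authors' pipeline nor a library implementation; the function `smote_like`, its parameters, and the toy two-gene expression values are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical expression values for two genes in four G1 (minority) tumors.
g1_samples = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
synthetic_g1 = smote_like(g1_samples, n_new=6)
print(len(synthetic_g1), "synthetic G1 samples generated")
```

In a cross-validated workflow, oversampling of this kind is typically applied inside each training fold only, so that synthetic points derived from a sample never end up in the data used to score the model.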
