Introduction: Pancreatic ductal adenocarcinoma (PDAC) has the lowest survival rate among all major cancers because early-stage disease lacks symptoms and because early detection tools and optimal therapies for late-stage patients are lacking. Thus, effective and non-invasive diagnostic tests are greatly needed. Recently, circulating miRNAs have been reported to be altered in PDAC. They are promising biomarkers because they are stable in the blood, can be detected non-invasively, and are amenable to convenient screening methods. This study aimed to use blood-based miRNA biomarkers and various analysis methods to develop a machine-learning (ML) model for PDAC. Methods: Blood-based miRNAs associated with PDAC were collected from open sources. miRNA sequences, target genes, and associated pathways were used to construct a set of descriptors for an ML model. Results: Bioinformatics analysis revealed that most genes in the pancreatic cancer and insulin signaling pathways were targeted by the PDAC-related miRNAs. The best-performing ML model, which used a Random Forest classifier, achieved an accuracy of 88.4%. On an independent test set of PDAC-associated miRNAs the model achieved 100% accuracy, whereas on non-cancer miRNAs it achieved 52.4% accuracy, indicating specificity to PDAC. Conclusions: Our results suggest that an ML model developed using the target gene, pathway, and sequence features of blood-based miRNA biomarkers could potentially be applied in PDAC diagnostics.