Abstract
Pancreatic ductal adenocarcinoma (PDAC) is an aggressive and lethal cancer. Diagnosing PDAC at an early stage is key to patient survival. However, existing biomarkers for diagnosing early PDAC are inexact in most cases, so an effective diagnostic biomarker is highly desirable. In the current work, we designed a novel computational approach based on within-sample relative expression orderings (REOs). A feature selection technique called minimum redundancy maximum relevance was used to pick out optimal REOs. We then compared the performance of different classification algorithms for discriminating PDAC and its adjacent normal tissues from non-PDAC tissues. The support vector machine algorithm performed best for identifying an early PDAC diagnostic biomarker. First, a signature composed of nine gene pairs was acquired from microarray gene expression data sets. These gene pairs produced a classification accuracy of up to 97.53% in fivefold cross-validation. Subsequently, two types of data from different platforms, namely microarray and RNA-Seq, were used to validate this signature. For the microarray data, all (100.00%) of 115 PDAC tissues and all (100.00%) of 31 PDAC adjacent normal tissues were correctly recognized as PDAC. In addition, 88.24% of 17 non-PDAC (normal or pancreatitis) tissues were correctly classified. For the RNA-Seq data, all (100.00%) of 177 PDAC tissues and all (100.00%) of 4 PDAC adjacent normal tissues were correctly recognized as PDAC. The validation results demonstrated that the signature performed well across platforms for early detection of PDAC. This work developed a new robust signature that might be a promising biomarker for early PDAC diagnosis.
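To make the idea of a within-sample REO concrete, the following is a minimal sketch rather than the authors' implementation. It encodes each gene pair as 1 when the first gene is expressed above the second in the same sample, and 0 otherwise. The gene names, expression values, and the function name `reo_profile` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def reo_profile(expression, pairs):
    """Encode each gene pair (a, b) as 1 if a is expressed above b in this sample, else 0."""
    return [1 if expression[a] > expression[b] else 0 for a, b in pairs]


# Hypothetical expression values for one sample (illustration only).
sample = {"G1": 5.2, "G2": 3.1, "G3": 7.8}

# All unordered gene pairs, in a fixed order: (G1, G2), (G1, G3), (G2, G3).
pairs = list(combinations(sorted(sample), 2))

profile = reo_profile(sample, pairs)

# A strictly increasing rescaling, used here as a stand-in for a platform
# or batch shift. This is an assumption: real batch effects need not be monotone.
shifted = {gene: 2.0 * value + 10.0 for gene, value in sample.items()}
shifted_profile = reo_profile(shifted, pairs)

print(profile, shifted_profile)
```

Because only the ordering within a sample is used, any strictly increasing transformation of that sample's values leaves the binary profile unchanged. This is one intuition for why such a signature could transfer between microarray and RNA-Seq measurements.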
Highlights
Pancreatic ductal adenocarcinoma (PDAC) is one of the deadliest malignant carcinomas and it accounts for at least 95% of all pancreatic cancer cases (Tanaka, 2016)
Using the relative expression orderings described in the Materials and Methods section, we found 30,865,512 stable gene pairs for the 458 PDAC samples and 49,177,748 stable gene pairs for the 122 PDAC adjacent normal samples in the training set
On the basis of the novel profiles, we captured the optimal feature set from the 16 gene pairs by using minimum redundancy maximum relevance (mRMR) with support vector machine (SVM), decision tree, logistic regression, random forest, naïve Bayes, and Bayes net
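The notion of a stable gene pair mentioned above can be sketched as follows. The exact stability criterion is given in the Materials and Methods section, which is not reproduced here, so this example assumes a simple rule: a pair is stable when the same ordering holds in at least a chosen fraction of samples. The `threshold` parameter, the function name `stable_pairs`, and the toy data are all hypothetical.

```python
from itertools import combinations


def stable_pairs(samples, threshold=0.95):
    """Return {(a, b): direction} for gene pairs with a consistent ordering.

    direction is 1 when a > b in at least `threshold` of the samples and
    0 when a < b in at least `threshold` of the samples. The threshold rule
    is an assumption made for this sketch.
    """
    genes = sorted(samples[0])
    n = len(samples)
    result = {}
    for a, b in combinations(genes, 2):
        higher = sum(1 for s in samples if s[a] > s[b])
        lower = sum(1 for s in samples if s[a] < s[b])
        if higher / n >= threshold:
            result[(a, b)] = 1
        elif lower / n >= threshold:
            result[(a, b)] = 0
    return result


# Toy cohort of four samples with three hypothetical genes.
cohort = [
    {"A": 5, "B": 3, "C": 1},
    {"A": 6, "B": 2, "C": 4},
    {"A": 7, "B": 1, "C": 3},
    {"A": 4, "B": 3, "C": 2},
]

stable = stable_pairs(cohort, threshold=0.75)
print(stable)
```

In this toy cohort, A is above both B and C in every sample, so those two pairs are kept. The ordering of B and C flips in half of the samples, so that pair is discarded. With tens of thousands of genes, the number of candidate pairs grows quadratically, which is consistent with the tens of millions of stable pairs reported above.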
Summary
Pancreatic ductal adenocarcinoma (PDAC) is one of the deadliest malignant carcinomas and accounts for at least 95% of all pancreatic cancer cases (Tanaka, 2016). PDAC lacks specific characteristics during the early stage, so early PDAC cannot be detected in time and chances for surgery are missed. Diagnostic signatures based on qualitative transcriptional information can be obtained with the relative expression ordering (REO) method. The REO method is highly robust to experimental batch effects (Eddy et al, 2010; Cai et al, 2015; Zhao et al, 2016) and platform differences (Guan et al, 2016; Cheng, 2019). The REO strategy has been successfully used to identify early diagnosis signatures of malignant carcinomas, such as gastric cancer (Yan et al, 2019), hepatocellular carcinoma