This study aimed to establish decision tree and logistic regression classification models for diagnosing pancreatic adenocarcinoma (PaCa) and to screen serum biomarkers for evaluating disease stage and curative effect. Serum samples from subjects with pancreatic adenocarcinoma (n = 58) and with normal pancreas (n = 51) were applied to strong anion exchange chromatography (SAX2) chips for protein profiling by SELDI-TOF-MS in order to screen multiple serum biomarkers. Biomarker Wizard software and several statistical methods, including a decision tree algorithm, logistic regression and ROC curves, were used to construct the decision tree and logistic regression classification models. An average of 61 mass peaks were detected in the M/Z range of 2,000-30,000. The ten decision trees with the highest cross-validation rate were chosen to construct the classification models, which differentiated PaCa from normal pancreas with a sensitivity of 83.3% and a specificity of 100%. The logistic regression model achieved an AUC of 0.976 ± 0.011 (P < 0.001), with a sensitivity of 77.6%-91.4% and a specificity of 92.2%-100%. Six mass peaks were combined by logistic regression to achieve AUCs of 0.897 ± 0.054, 0.978 ± 0.021 and 0.792 ± 0.107 (P < 0.05) in the three groups (patients at stages I and II, stages II and III, and stages III and IV), respectively. One mass peak (M/Z 4,016) differed significantly (P < 0.05) between the preoperative and postoperative PaCa samples, and its intensity decreased in the weeks after operation. Decision tree and logistic regression classification models built on the mass peaks screened by SELDI-TOF-MS serum profiling can be used to differentiate pancreatic adenocarcinoma from normal pancreas, and are superior to CA 19-9. The detected mass peaks are helpful for evaluating the curative effect and prognosis of pancreatic adenocarcinoma.