Abstract

Existing biomarkers for ovarian cancer lack sensitivity and specificity. We compared the diagnostic efficacy of nonlinear machine learning models and linear statistical models for diagnosing ovarian cancer using a combination of conventional laboratory indicators. We divided 901 retrospective samples into an ovarian cancer group and a control group, the latter comprising non-ovarian malignant gynecological tumor (NOMGT), benign gynecological disease (BGD), and healthy control subgroups. Cases were randomly assigned to training and internal validation sets. Two linear models (logistic regression (LR) and Fisher's linear discriminant (FLD)) and three nonlinear models (support vector machine (SVM), random forest (RF), and artificial neural network (ANN)) were constructed using 22 conventional laboratory indicators and three demographic characteristics, and their performance was compared. In an independent, prospectively recruited validation set, the models ranked in descending order of diagnostic efficiency as RF, SVM, ANN, FLD, LR, and carbohydrate antigen 125 (CA125)-only (AUC, accuracy: 0.989, 95.6%; 0.985, 94.4%; 0.974, 93.4%; 0.915, 82.1%; 0.859, 80.1%; and 0.732, 73.0%, respectively). RF maintained satisfactory classification performance for identifying different ovarian cancer stages and for discriminating ovarian cancer from NOMGT, BGD, or CA125-positive controls. Nonlinear models outperformed linear models, indicating that nonlinear machine learning models can efficiently use conventional laboratory indicators for ovarian cancer diagnosis.
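The two metrics used above to rank the models, AUC and accuracy, can be illustrated with a minimal Python sketch. It is not the study's code: the function names `auc` and `accuracy`, the 0.5 decision threshold, and the toy labels and scores are all assumptions chosen for illustration, and none of the numbers come from the study data.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    This equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at an assumed threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predicted) if y == p)
    return correct / len(labels)


# Toy data (hypothetical): 1 = ovarian cancer, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]  # scores from a hypothetical model A
model_b = [0.9, 0.3, 0.6, 0.7, 0.2, 0.1]  # scores from a hypothetical model B

for name, scores in [("model_a", model_a), ("model_b", model_b)]:
    print(name, round(auc(labels, scores), 3), round(accuracy(labels, scores), 3))
```

AUC depends only on how the scores rank cases, whereas accuracy depends on the chosen threshold, which is why the two metrics are reported together and need not move in lockstep.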
