Ovarian cancer (OC), the second leading cause of death among gynecologic cancers, is often diagnosed at an advanced stage because it is asymptomatic in its early stages. This study aimed to explore the diagnostic potential of plasma-based lipidomics combined with machine learning (ML) in OC. Non-targeted lipidomics analysis was conducted on plasma samples from participants with epithelial ovarian cancer (EOC), participants with benign ovarian tumors (BOT), and healthy controls (HC). The samples were randomly divided into a training set and a test set. Differential lipids between groups were selected using the two-tailed Student’s t-test and partial least squares discriminant analysis (PLS-DA). Both single-lipid receiver operating characteristic (ROC) models and multi-lipid ML models were constructed to investigate the diagnostic value of the differential lipids. Several lipids showed significant diagnostic potential. ST 27:2;O achieved the highest prediction accuracy, 0.92, in distinguishing EOC from HC. DG 42:2 achieved the highest prediction accuracy, 0.96, in distinguishing BOT from HC. Cer d18:1/18:0 achieved the highest prediction accuracy, 0.65, in differentiating EOC from BOT. The multi-lipid ML models showed better diagnostic performance than the single-lipid models. The k-nearest neighbors (k-NN), partial least squares (PLS), and random forest (RF) models achieved the highest prediction accuracy, 0.96, in discriminating EOC from HC. The support vector machine (SVM) model reached the highest prediction accuracy both in distinguishing BOT from HC and in differentiating EOC from BOT, with accuracies of 1.00 and 0.74, respectively. In conclusion, this study revealed that the combination of plasma-based lipidomics and ML algorithms is an effective method for diagnosing OC.
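The single-lipid evaluation described above (a Student's t-test to flag a differential lipid, then a ROC-based model scored on held-out samples) can be illustrated with a minimal Python sketch. This is not the study's code. The function names, the synthetic intensities, the assumption that the lipid is elevated in cases, and the accuracy-maximizing cutoff rule are all illustrative assumptions, and only the t statistic is computed (no p-value).

```python
import random
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic using the pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, cutoff):
    """Fraction classified correctly when predicting 'case' for score >= cutoff."""
    hits = [int((s >= cutoff) == (y == 1)) for s, y in zip(scores, labels)]
    return sum(hits) / len(hits)


def fit_threshold(scores, labels):
    """Choose the observed value that maximizes accuracy on the given samples."""
    return max(sorted(set(scores)), key=lambda c: accuracy(scores, labels, c))


# Synthetic intensities for one hypothetical lipid (NOT data from the study).
random.seed(0)
controls = [random.gauss(1.0, 0.3) for _ in range(40)]
cases = [random.gauss(1.6, 0.3) for _ in range(40)]

# Split each group 70/30 into training and test samples.
train_x = controls[:28] + cases[:28]
train_y = [0] * 28 + [1] * 28
test_x = controls[28:] + cases[28:]
test_y = [0] * 12 + [1] * 12

t_stat = student_t(cases[:28], controls[:28])  # screening on training data only
cutoff = fit_threshold(train_x, train_y)       # cutoff learned on training data
print("t statistic:", round(t_stat, 2))
print("test AUC:", round(roc_auc(test_x, test_y), 2))
print("test accuracy:", round(accuracy(test_x, test_y, cutoff), 2))
```

Fitting the cutoff on the training samples and reporting accuracy only on the test samples mirrors the train/test division mentioned in the abstract. A multi-lipid model would replace the single cutoff with a classifier such as k-NN, RF, or SVM trained on several lipids at once.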