Abstract

Alpha-fetoprotein (AFP)-negative hepatocellular carcinoma (ANHCC) patients account for more than 30% of all HCC patients and are easily misdiagnosed. This three-phase study was designed to find and validate new N-glycan markers for ANHCC, identified through The Cancer Genome Atlas (TCGA) database and noninvasive serum detection. Differentially expressed genes (DEGs) among genes related to N-glycan biosynthesis and degradation were screened from the TCGA database. Serum N-glycan structure abundances were analyzed using N-glycan fingerprint (NGFP) technology. A total of 1340 participants, including patients with ANHCC, patients with chronic liver diseases, and healthy controls, were enrolled after propensity score matching (PSM). The Lasso algorithm was used to select the most significant N-glycan structure abundances. Three machine learning models [random forest (RF), support vector machine (SVM) and logistic regression (LR)] were used to construct the diagnostic algorithms. All 13 N-glycan structure abundances analyzed by NGFP were significant and were selected by Lasso. Among the three machine learning models, the LR algorithm demonstrated the best diagnostic performance for identifying ANHCC in the training cohort (LR: AUC 0.842, 95% CI 0.784-0.899; RF: AUC 0.825, 95% CI 0.766-0.885; SVM: AUC 0.610, 95% CI 0.527-0.684). The LR algorithm again achieved high diagnostic performance in the independent validation cohort (AUC 0.860, 95% CI 0.824-0.897). Furthermore, in follow-up validation, the LR algorithm could stratify ANHCC patients into two distinct subgroups at high or low risk with respect to overall survival and recurrence. In conclusion, a biomarker panel consisting of 13 N-glycan structure abundances, combined with the best-performing algorithm (LR), was defined and appears to be an effective tool for HCC prediction and prognosis estimation in AFP-negative subjects.
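The performance figures above are areas under the ROC curve (AUC) with 95% confidence intervals. The abstract does not state how those intervals were computed, so the following is only a minimal sketch of one common approach: the AUC as a rank statistic, with a percentile bootstrap interval. The function names, the toy labels, and the toy scores are all hypothetical and are not taken from the study.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("need at least one positive and one negative case")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that happen to contain only one class.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[round((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical toy data: 1 = ANHCC, 0 = control; scores stand in for
    # the predicted probabilities of a fitted classifier.
    toy_labels = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
    toy_scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.65, 0.30, 0.90, 0.55, 0.70]
    point = auc(toy_labels, toy_scores)
    low, high = bootstrap_ci(toy_labels, toy_scores)
    print(f"AUC: {point:.3f}, 95% CI: {low:.3f}-{high:.3f}")
```

In practice the scores would be the predicted probabilities from each fitted model (LR, RF, SVM), evaluated separately on the training and validation cohorts.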
