Abstract
Objective: This study aimed to classify open-access gene expression data of patients with hepatitis B virus-related hepatocellular carcinoma (HBV + HCC) and patients with chronic HBV without HCC (HBV alone) using XGBoost, a machine learning method, and to identify important genes that may cause HCC.

Methods: This case-control study used open-access gene expression data of patients with HBV + HCC and HBV alone. Data from 17 patients with HBV + HCC and 36 patients with HBV alone were included. An XGBoost classifier was built and evaluated using 10-fold cross-validation. Model performance was assessed with accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score.

Results: The feature-selection method selected 18 genes, and the model was built with these genes as input variables. The accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score of the XGBoost model were 98.1%, 98.6%, 100%, 97.2%, 94.4%, 100%, and 97.1%, respectively. Based on the predictor importance values obtained from XGBoost, RNF26, FLJ10233, ACBD6, RBM12, PFAS, H3C11, and GKP5 can be employed as potential biomarkers of HBV-related HCC.

Conclusions: In this study, genes that may be biomarkers of HBV-related HCC were identified using a machine learning-based prediction approach. After the reliability of these genes is clinically verified in subsequent research, therapeutic procedures can be established based on them, and their usefulness in clinical practice may be documented.
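The reported metrics can be checked against one another with a short calculation. The sketch below is illustrative only: the abstract does not report a confusion matrix, so the counts used here (17 true positives, 0 false negatives, 35 true negatives, 1 false positive) are an assumption, chosen because they match the stated group sizes (17 HBV + HCC, 36 HBV alone) and reproduce the reported percentages. The function name `classification_metrics` is hypothetical.

```python
# Hypothetical confusion-matrix counts (NOT reported in the abstract).
# They are an assumption consistent with 17 HBV + HCC and 36 HBV-alone patients.
tp, fn = 17, 0   # HBV + HCC patients (positive class)
tn, fp = 35, 1   # HBV-alone patients (negative class)


def classification_metrics(tp, fn, tn, fp):
    """Compute the standard binary classification metrics from raw counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


metrics = classification_metrics(tp, fn, tn, fp)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, a single misclassified HBV-alone patient would account for every metric that falls below 100%, which illustrates how sensitive the percentages are to individual cases in a sample of 53.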