Development and Evaluation of Novel Statistical Methods in Urine Biomarker-Based Hepatocellular Carcinoma Screening

Jeremy Wang, Wei Song, Surbhi Jain, Ying-Hsiu Su, Chi-Tan Hu, Dion Chen

doi:10.1038/s41598-018-21922-9

Abstract

Hepatocellular carcinoma (HCC) is one of the fastest-growing cancers in the US and has a low survival rate, partly because it is difficult to detect early. Because HCC is highly heterogeneous, it has been suggested that multiple biomarkers are needed to develop a sensitive HCC screening test. This study applied random forest (RF), a machine learning technique, and proposed two novel models, fixed sequential (FS) and two-step (TS), and compared them with two commonly used statistical techniques, logistic regression (LR) and classification and regression trees (CART), for combining multiple urine DNA biomarkers in HCC screening. Biomarker values were obtained from 137 HCC and 431 non-HCC (224 hepatitis and 207 cirrhosis) subjects. Sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and variability were estimated through repeated 10-fold cross-validation to compare the models’ accuracy and robustness. We show that RF and TS have higher accuracy and stability; specifically, they reach 90% specificity with 86% and 87% sensitivity, respectively, along with 15% higher sensitivity and 10% higher specificity than LR in cross-validation. The potential of RF and TS for developing a panel of multiple biomarkers and the possibility of self-training, cloud-based models for HCC screening are discussed.
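The three accuracy measures named in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = HCC, 0 = non-HCC) and a continuous model score, and it computes the area under the receiver operating characteristic curve (AUC) as the fraction of HCC/non-HCC pairs in which the HCC subject scores higher. The function names and the toy values are illustrative assumptions, not the study's code or data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity); labels are 1 = HCC, 0 = non-HCC."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical decision threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
# prints: sensitivity=0.667 specificity=1.000 AUC=0.917
```

The pairwise AUC is quadratic in the number of subjects, which is acceptable for a cohort of a few hundred; a rank-based formula would be the usual choice for larger data.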

Highlights

To analyze multiple variables and generate algorithms for classification, many different multivariate models can be applied (e.g., k-nearest neighbor and Bayesian classifiers) [20,21].
The results from both the model-building and cross-validation datasets suggest that TS and random forest (RF) improve upon AFP, logistic regression (LR), classification and regression trees (CART), and fixed sequential (FS) in developing four genetic and epigenetic biomarkers into a potentially robust and sensitive HCC screening test.
The algorithms (FS and TS) developed in our study comprised AFP and three urine DNA markers and achieved up to 87% sensitivity and 90% specificity in the validation set, while AFP alone achieved 99% specificity but only 48.2% sensitivity at the cutoff of 20 ng/mL recommended by the American Association for the Study of Liver Diseases (AASLD).
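The AFP-alone comparison above is a single-marker threshold rule, and a short sketch shows how the choice of cutoff trades sensitivity against specificity. The 20 ng/mL cutoff comes from the text; treating a value exactly at the cutoff as positive, the function names, and all AFP values are assumptions made for illustration only.

```python
from typing import Sequence, Tuple

AFP_CUTOFF_NG_ML = 20.0  # cutoff cited in the text


def screen_by_afp(afp_ng_ml: Sequence[float], has_hcc: Sequence[int],
                  cutoff: float = AFP_CUTOFF_NG_ML) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of the rule 'AFP >= cutoff means screen-positive'.

    Assumption: the comparison is inclusive; the source does not say.
    """
    tp = sum(1 for a, h in zip(afp_ng_ml, has_hcc) if h == 1 and a >= cutoff)
    tn = sum(1 for a, h in zip(afp_ng_ml, has_hcc) if h == 0 and a < cutoff)
    n_pos = sum(has_hcc)
    n_neg = len(has_hcc) - n_pos
    return tp / n_pos, tn / n_neg


# Made-up AFP values (ng/mL): four HCC subjects followed by five non-HCC subjects.
afp = [3.0, 250.0, 12.0, 48.0, 2.0, 5.0, 8.0, 15.0, 22.0]
hcc = [1, 1, 1, 1, 0, 0, 0, 0, 0]

print(screen_by_afp(afp, hcc, 20.0))  # (0.5, 0.8)
print(screen_by_afp(afp, hcc, 10.0))  # (0.75, 0.6)
```

In this toy data, lowering the cutoff catches one more HCC subject but flags one more non-HCC subject. A single marker can only move along that trade-off, which is the motivation the text gives for combining several markers.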

Summary

Introduction

To analyze multiple variables and generate algorithms for classification, many different multivariate models can be applied (e.g., k-nearest neighbor and Bayesian classifiers) [20,21]. This study used biomarker values obtained from the study cohort of 137 HCC and 431 non-HCC (224 hepatitis and 207 cirrhosis) subjects to compare five multivariate models (LR, CART, RF, FS, and TS) on robustness and predictive accuracy, and to identify the best model for developing multiple biomarkers into a single panel for use as a sensitive HCC screening test. Robustness was examined via the variation in the validation data across iterations. The results from both the model-building and cross-validation datasets suggest that TS and RF improve upon AFP, LR, CART, and FS in developing four genetic and epigenetic biomarkers into a potentially robust and sensitive HCC screening test.
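The evaluation scheme described above, repeated 10-fold cross-validation with robustness read from the spread of validation results, can be sketched as follows. This is a minimal illustration under stated assumptions: the folds are stratified by class, the "model" is a toy one-marker threshold standing in for LR, CART, RF, FS, or TS, the marker values are synthetic, and five repeats are used. Only the class sizes (137 HCC, 431 non-HCC) come from the text.

```python
import random
import statistics
from typing import List, Tuple


def stratified_folds(labels: List[int], k: int, rng: random.Random) -> List[List[int]]:
    """Split sample indices into k folds with a similar class ratio in each fold."""
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def fit_threshold(x: List[float], y: List[int]) -> float:
    """Toy stand-in for a model: the midpoint between the two class means."""
    pos = [v for v, t in zip(x, y) if t == 1]
    neg = [v for v, t in zip(x, y) if t == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2


def repeated_cv(x: List[float], y: List[int], k: int = 10,
                repeats: int = 5, seed: int = 0) -> List[Tuple[float, float]]:
    """Return one (sensitivity, specificity) pair per validation fold per repeat."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_x = [x[i] for i in range(len(x)) if i not in held_out]
            train_y = [y[i] for i in range(len(y)) if i not in held_out]
            cut = fit_threshold(train_x, train_y)  # fit on training folds only
            tp = sum(1 for i in test_idx if y[i] == 1 and x[i] >= cut)
            tn = sum(1 for i in test_idx if y[i] == 0 and x[i] < cut)
            n_pos = sum(y[i] for i in test_idx)
            results.append((tp / n_pos, tn / (len(test_idx) - n_pos)))
    return results


# Synthetic single-marker cohort using the class sizes from the text.
data_rng = random.Random(42)
y = [1] * 137 + [0] * 431
x = [data_rng.gauss(2.0, 1.0) if t == 1 else data_rng.gauss(0.0, 1.0) for t in y]

res = repeated_cv(x, y)
sens = [s for s, _ in res]
spec = [s for _, s in res]
print(f"sensitivity: mean={statistics.mean(sens):.3f} sd={statistics.stdev(sens):.3f}")
print(f"specificity: mean={statistics.mean(spec):.3f} sd={statistics.stdev(spec):.3f}")
```

The standard deviation across validation folds is one plausible reading of "variation in the validation data of each iteration"; a smaller spread would indicate a more robust model under this reading.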

Journal: Scientific Reports
Publication Date: Feb 28, 2018
Citations: 23
License type: open-access
