Abstract

Background: Human papillomavirus (HPV) DNA testing is used in cervical cancer screening as an effective cancer prevention strategy. The HPV viral load reported by different assays has attracted increasing attention for its potential value in diagnosing disease and detecting progression.

Methods: In this study, three HPV testing datasets were assessed and compared: Hybrid Capture 2 (HC2; n = 31,954), Aptima HPV E6E7 (n = 3269) and HPV Cobas 4800 (n = 13,342). Logistic regression models for diagnosing early cervical lesions were established for each of the three datasets and compared. The best combination of variables (viral load plus bacterial vaginosis, VL + BV) and the best dataset (HC2) were then used to build six machine learning models. The models were evaluated and compared, and the best-performing model was validated.

Results: Viral load was significantly correlated with cervical lesion stage in all three datasets. Viral load and bacterial vaginosis formed the best combination of variables for the logistic regression models, and models based on the HC2 dataset performed better than those based on the other two datasets. Among the machine learning methods, XGBoost produced the highest AUC values: 0.915, 0.9529, 0.9557 and 0.9614 for diagnosing cervical lesions staged ASCUS or higher, ASC-H or higher, LSIL or higher, and HSIL or higher, respectively, indicating acceptable accuracy of the selected diagnostic model.

Conclusions: Our study demonstrates that HPV viral load and BV status were significantly associated with the early stages of cervical lesions. The best-performing models can serve as a useful tool to help diagnose cervical lesions early.
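To make the reported metric concrete, the following is a minimal sketch of how an AUC for an "X or higher" endpoint can be computed from a single continuous score such as viral load. It is not the study's code. The function names, the grade ordering in `ORDER` (including the "normal" baseline label) and all data values are assumptions invented for illustration, and the raw score stands in for the output of a fitted model.

```python
# Hypothetical ordering of cytology grades, lowest to highest.
# Both the ordering and the "normal" baseline label are assumptions.
ORDER = ["normal", "ASCUS", "ASC-H", "LSIL", "HSIL"]


def at_or_above(grades, threshold, order=ORDER):
    """Return 1 for grades at or above `threshold` in `order`, else 0."""
    cut = order.index(threshold)
    return [1 if order.index(g) >= cut else 0 for g in grades]


def auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: one grade and one viral-load-like score per patient.
toy_grades = ["normal", "normal", "ASCUS", "normal",
              "LSIL", "HSIL", "ASC-H", "normal"]
toy_scores = [0.5, 1.2, 3.0, 5.0, 8.0, 40.0, 15.0, 0.8]

# One AUC per "grade or higher" endpoint, mirroring the four endpoints
# named in the abstract.
for threshold in ORDER[1:]:
    labels = at_or_above(toy_grades, threshold)
    print(threshold, "or higher:", auc(labels, toy_scores))
```

An AUC of 0.5 corresponds to a score that ranks cases no better than chance, and 1.0 to a score that ranks every positive case above every negative one. In practice the scores would be predicted probabilities from a fitted model evaluated on held-out data.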
