Abstract

Background: Pathogen testing for tuberculosis (TB) is not yet sufficient for early and differential clinical diagnosis; thus, we investigated the potential of screening long non-coding RNAs (lncRNAs) from human hosts and using machine learning (ML) algorithms combined with electronic health record (EHR) metrics to construct a diagnostic model.

Methods: A total of 2,759 subjects were included in this study: 12 in the primary screening cohort [7 TB patients and 5 healthy controls (HCs)] and 2,747 in the selection cohort (798 TB patients, 299 patients with non-TB lung disease, and 1,650 HCs). An Affymetrix HTA2.0 array and qRT-PCR were applied to screen new specific lncRNA markers for TB in individual nucleated cells from host peripheral blood. An ML algorithm was established to combine the patients' EHR information and lncRNA data via logistic regression models and nomogram visualization to differentiate pulmonary TB (PTB) among suspected patients in the selection cohort.

Results: Two differentially expressed lncRNAs (TCONS_00001838 and n406498) were identified (p < 0.001) in the selection cohort. The optimal model was the “LncRNA + EHR” model, which included these two lncRNAs and eight EHR parameters (age, hemoglobin, lymphocyte count, gamma interferon release test, weight loss, night sweats, polymorphic changes, and calcified foci on imaging). This model was visualized as a nomogram and validated; its accuracy was 0.79 (0.75–0.82), with a sensitivity of 0.81 (0.78–0.86), a specificity of 0.73 (0.64–0.79), and an area under the ROC curve (AUC) of 0.86. Furthermore, the nomogram showed good compliance in predicting the risk of TB and a higher net benefit than the “EHR” model for threshold probabilities of 0.2–1.

Conclusion: LncRNAs TCONS_00001838 and n406498 have the potential to become new molecular markers for PTB, and the nomogram of the “LncRNA + EHR” model is expected to be effective for the early clinical diagnosis of TB.
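As a reminder of how the performance figures quoted in the abstract are defined, the following minimal sketch computes accuracy, sensitivity, specificity, and AUC from predicted probabilities. The labels, scores, and the 0.5 cut-off are invented toy values, not study data, and the function names are hypothetical; the study's own software and thresholds are not stated here.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from binary labels (1 = TB)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only): 4 TB cases, 4 non-TB cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

metrics = diagnostic_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise AUC definition is quadratic in the number of subjects, which is fine for a sketch; a rank-based implementation would be preferable for a cohort of several thousand.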

Highlights

  • Tuberculosis (TB) is a major global infectious disease caused by Mycobacterium tuberculosis (MTB) infection that poses a serious risk to human health (World Health Organization [WHO], 2020)

  • Four candidate long non-coding RNAs (lncRNAs) were selected from the HTA2.0 chip comparison of PTB patients and healthy subjects in the primary screening stage, according to the criteria of “fold-change >2, original signal value >25, and no previous reports in the literature” (Mitchell et al., 2008)

  • In the model training stage, a binary logistic regression model for the differential diagnosis of TB was constructed by combining candidate lncRNAs with electronic health record (EHR) data, using patients with non-TB lung disease as controls; the optimal model was further visualized as a nomogram
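The two stages described in the highlights can be sketched as follows: a threshold filter for candidate lncRNAs, followed by the logistic function that turns a weighted sum of predictors into a TB probability (the quantity a nomogram reads off graphically). All record IDs, values, coefficients, and function names below are hypothetical placeholders; the fitted coefficients of the study's model are not given in this text.

```python
import math

# Hypothetical array records; IDs and numbers are invented for illustration.
candidates = [
    {"id": "lnc_A", "fold_change": 3.1, "signal": 40.0, "reported": False},
    {"id": "lnc_B", "fold_change": 1.6, "signal": 90.0, "reported": False},
    {"id": "lnc_C", "fold_change": 2.8, "signal": 12.0, "reported": False},
    {"id": "lnc_D", "fold_change": 4.0, "signal": 55.0, "reported": True},
    {"id": "lnc_E", "fold_change": 2.2, "signal": 31.0, "reported": False},
]


def screen(records, min_fold=2.0, min_signal=25.0):
    """Keep records with fold-change > 2, signal > 25 and no prior report."""
    return [
        r["id"]
        for r in records
        if r["fold_change"] > min_fold
        and r["signal"] > min_signal
        and not r["reported"]
    ]


def tb_probability(features, coefficients, intercept):
    """Binary logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefficients[k] * features[k] for k in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


selected = screen(candidates)

# Assumed coefficients, chosen only to show the calculation.
coefficients = {"lnc_A": 0.8, "age": -0.02, "night_sweats": 1.1}
patient = {"lnc_A": 1.5, "age": 40, "night_sweats": 1}
risk = tb_probability(patient, coefficients, intercept=-0.5)
print(selected, round(risk, 3))
```

In a nomogram, each predictor's contribution `b_i * x_i` is rescaled to a points axis and the total points map back to this same probability, so the chart and the formula are two views of one model.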


Summary

Introduction

Tuberculosis (TB) is a major global infectious disease caused by Mycobacterium tuberculosis (MTB) infection that poses a serious risk to human health (World Health Organization [WHO], 2020). Non-coding RNAs (ncRNAs) are produced during the transcription of the human genome into primary transcripts (Djebali et al., 2012). They have greater tissue and spatiotemporal specificity than mRNAs and are involved in the body’s immune response and pathological damage processes in multiple ways (Distefano, 2018; Momen-Heravi and Bala, 2018). Host long non-coding RNAs (lncRNAs), a major subtype of ncRNAs, have potential as early molecular markers of TB. However, existing findings are insufficient to improve the early diagnosis of TB, and more TB-specific lncRNA markers need to be screened in different populations. Because pathogen testing for TB is not yet sufficient for early and differential clinical diagnosis, we investigated the potential of screening lncRNAs from human hosts and using machine learning (ML) algorithms combined with electronic health record (EHR) metrics to construct a diagnostic model.

