Abstract
Accurate and prompt diagnosis of infection is essential for improving patient outcomes and preventing bacterial drug resistance. Host gene expression profiling holds great potential for early and accurate diagnosis of infection. To improve the precision of infection diagnosis, we developed InfectDiagno, a rank-based ensemble machine learning algorithm that diagnoses infection from host gene expression patterns. Eleven data sets were used as training data for method development, and the algorithm was optimized on these multi-cohort training samples. Nine further data sets served as independent validation data, and we additionally validated the diagnostic capacity of InfectDiagno in a prospective clinical cohort. After selecting 100 feature genes based on their gene expression ranks, we trained two classifiers: a noninfected-vs-infected classifier (area under the receiver-operating characteristic curve [AUC] 0.95 [95% CI, 0.93-0.97]) and a bacterial-vs-viral classifier (AUC 0.95 [95% CI, 0.93-0.97]). We then combined the noninfected/infected classifier with the bacterial/viral classifier to build a discriminating infection diagnosis model. Its sensitivity was 0.931 and 0.872, and its specificity 0.963 and 0.929, for bacterial and viral infections, respectively. Applied to the prospective clinical cohort (n = 517), InfectDiagno classified 95% of the samples correctly. Our study shows that InfectDiagno is a powerful and robust tool for accurately identifying infection in a real-world patient population, with the potential to substantially improve clinical care in infection diagnosis.
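The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: turning a sample's expression values into within-sample ranks, and chaining a noninfected/infected decision with a bacterial/viral decision. The function names, the [0, 1] rank scaling, and the 0.5 cutoffs are illustrative assumptions, not the published InfectDiagno method; the probabilities passed to `diagnose` stand in for the outputs of whatever trained classifiers are used.

```python
from typing import Dict


def rank_transform(expression: Dict[str, float]) -> Dict[str, float]:
    """Replace each gene's expression value with its within-sample rank.

    Ranks are scaled to [0, 1] (lowest expression -> 0, highest -> 1).
    Ties are broken by input order; this is a simplification chosen for
    the sketch, not something stated in the abstract.
    """
    genes = sorted(expression, key=expression.get)
    n = len(genes)
    if n == 1:
        return {genes[0]: 0.0}
    return {gene: i / (n - 1) for i, gene in enumerate(genes)}


def diagnose(
    p_infected: float,
    p_bacterial: float,
    infected_cutoff: float = 0.5,   # hypothetical cutoff
    bacterial_cutoff: float = 0.5,  # hypothetical cutoff
) -> str:
    """Chain two binary decisions into one three-way call.

    p_infected:  score from a noninfected-vs-infected classifier.
    p_bacterial: score from a bacterial-vs-viral classifier, consulted
                 only when the first stage calls the sample infected.
    """
    if p_infected < infected_cutoff:
        return "noninfected"
    return "bacterial" if p_bacterial >= bacterial_cutoff else "viral"


if __name__ == "__main__":
    sample = {"GENE_A": 5.0, "GENE_B": 100.0, "GENE_C": 20.0}
    print(rank_transform(sample))
    print(diagnose(p_infected=0.8, p_bacterial=0.1))
```

Working on ranks rather than raw expression values makes each sample's features depend only on the ordering of genes within that sample, which is one common reason rank-based methods are used when pooling data from multiple cohorts.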