Diagnosing pertussis from clinical experience alone remains difficult. We aimed to develop a machine learning model that uses biochemical blood test results to diagnose pertussis. A total of 295 patients with pertussis and 295 patients with non-pertussis lower respiratory infections, seen between January 2022 and January 2023 and matched for age and gender ratio, were included in the study. All patients underwent reverse transcription polymerase chain reaction testing for pertussis and for viral pathogens. Univariate logistic regression analysis was used to screen for clinical and blood biochemical features associated with pertussis. The selected features were then used with 3 machine learning algorithms, K-nearest neighbor, support vector machine, and eXtreme Gradient Boosting (XGBoost), to develop diagnostic models. In the univariate logistic regression analysis, 18 of the 27 candidate features were retained as optimal features associated with pertussis. The XGBoost model was significantly superior to both the support vector machine model (DeLong test, P = .01) and the K-nearest neighbor model (DeLong test, P = .01), achieving an area under the receiver operating characteristic curve of 0.96 and an accuracy of 0.923. A diagnostic model based on blood biochemical test results at admission and the XGBoost algorithm can help doctors diagnose pertussis effectively.
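The two performance measures reported above, the area under the receiver operating characteristic curve (AUC) and accuracy, can be illustrated with a minimal sketch. This is not the study's code: the function names `roc_auc` and `accuracy`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study data. The sketch computes AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher predicted score.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at a given score threshold.
    The 0.5 default is an assumption, not a value taken from the study."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return correct / len(labels)


# Hypothetical toy example (1 = pertussis, 0 = non-pertussis); not study data.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_acc = accuracy(toy_labels, toy_scores)
```

In this toy example, 8 of the 9 positive-negative pairs are ranked correctly, so `toy_auc` is 8/9, and 4 of the 6 cases are classified correctly at the assumed threshold. In practice a library routine would normally be used, and comparing two AUCs, as the DeLong test does, requires a variance estimate that this sketch does not compute.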