Abstract

Aim. To develop and validate a machine learning model that identifies suspected pulmonary embolism (PE) from clinical features recorded in the electronic health records (EHRs) of outpatients and inpatients.

Material and methods. Data from 19730 patients from 7 Russian regions were analyzed. The EHR data covered the period from March 21, 2007 to February 4, 2022. Complaints, clinical and laboratory data, and concomitant diseases were used as diagnostic features. PE was diagnosed in 1379 patients; the diagnosis was based on ICD-10 codes. Seven machine learning algorithms were applied to diagnose PE: XGBoost, LightGBM, CatBoost, Logistic Regression, MLP Classifier, Random Forest Classifier, and Gradient Boosting Classifier.

Results. The Gradient Boosting Classifier-based model was selected for further prospective testing, with a sensitivity of 0.899 (95% confidence interval (CI), 0.864-0.932), a specificity of 0.875 (95% CI, 0.863-0.86), and an area under the ROC curve of 0.952 (95% CI, 0.938-0.964). The following features had the greatest predictive value: cough, respiratory disorders, blood creatinine, body temperature, general weakness, heart rate, respiratory rate, edema, antihypertensive therapy, oxygen saturation, and age.

Conclusion. The model is intended for the initial encounter with patients who present with complaints and suspected PE, regardless of the type of care.
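
The sensitivity and specificity reported in the Results, together with their 95% CIs, can be illustrated with a short Python sketch. The abstract does not state how the intervals were computed, so the percentile bootstrap used here is an assumption, not the authors' method. The function names and the toy labels are hypothetical and are not study data.

```python
import random


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # PE correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # PE missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-PE correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-PE wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def bootstrap_ci(y_true, y_pred, metric_index, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method).

    metric_index selects sensitivity (0) or specificity (1).
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        yp = [y_pred[i] for i in idx]
        # Skip resamples that lack one of the two classes.
        if 0 < sum(yt) < n:
            stats.append(sensitivity_specificity(yt, yp)[metric_index])
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


# Toy example (made-up labels): 10 PE cases, 90 non-PE cases.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 9 + [0] * 1 + [0] * 80 + [1] * 10

sens, spec = sensitivity_specificity(y_true, y_pred)
sens_ci = bootstrap_ci(y_true, y_pred, metric_index=0)
spec_ci = bootstrap_ci(y_true, y_pred, metric_index=1)
print(f"sensitivity={sens:.3f}, 95% CI {sens_ci[0]:.3f}-{sens_ci[1]:.3f}")
print(f"specificity={spec:.3f}, 95% CI {spec_ci[0]:.3f}-{spec_ci[1]:.3f}")
```

With only 10 positive cases the sensitivity interval in this toy example is wide. The narrower specificity interval in the abstract is consistent with the much larger number of patients without PE.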

