Abstract

Background: To develop and validate machine learning models for data entry error detection in a national out-of-hospital cardiac arrest (OHCA) prehospital patient care report database. Methods: Adult OHCAs of presumed cardiac etiology were included. Data entry errors were defined as discrepancies between the coded data and the free-text note documenting the intervention or event; for example, information that was recorded as “absent” in the coded data but “present” in the free-text note. Machine learning models for error detection were developed and validated separately for each of nine core variables, using the extreme gradient boosting, logistic regression, extreme gradient boosting outlier detection, and K-nearest neighbor outlier detection algorithms. Results: Among 12,100 OHCAs, 16.2% of cases had at least one type of error. The area under the receiver operating characteristic curve (AUC) of the best-performing model (the model with the highest AUC for each outcome variable) ranged from 0.71 to 0.95. Machine learning models detected errors most efficiently for the place and initial rhythm variables: 82.6% of place errors and 93.8% of initial rhythm errors could be detected by checking only 11% and 35% of the data, respectively, rather than checking all data. Conclusion: Machine learning models can detect data entry errors in the care reports of emergency medical services (EMS) clinicians with acceptable performance and can likely improve the efficiency of data quality control. EMS organizations that provide more prehospital interventions for OHCA patients could have higher error rates and may benefit from adopting error-detection models.
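The efficiency figures in the Results (a share of errors detected while checking only a share of the data) can be read as recall within a review budget: records are ranked by a model's error score, only the top-ranked share is checked by hand, and the share of true errors that falls inside that set is reported. The sketch below illustrates that calculation in plain Python. The function name `recall_at_review_fraction`, the scores, and the labels are hypothetical and exist only for illustration; this is not the study's code or data.

```python
import math


def recall_at_review_fraction(scores, is_error, fraction):
    """Share of true errors captured when only the top `fraction` of
    records, ranked by model score (highest first), is manually reviewed."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if len(scores) != len(is_error):
        raise ValueError("scores and is_error must have the same length")
    total_errors = sum(is_error)
    if total_errors == 0:
        raise ValueError("labels contain no errors")

    # Number of records a reviewer would check under this budget.
    n_review = math.ceil(fraction * len(scores))

    # Rank records from most to least suspicious.
    ranked = sorted(zip(scores, is_error), key=lambda pair: pair[0], reverse=True)

    # Count the true errors that land inside the reviewed set.
    caught = sum(label for _, label in ranked[:n_review])
    return caught / total_errors


# Hypothetical toy data: 10 records, 4 of which contain a true entry error.
scores = [0.95, 0.90, 0.80, 0.40, 0.35, 0.30, 0.20, 0.15, 0.10, 0.05]
is_error = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

# Reviewing the top half of the ranked records finds 3 of the 4 errors.
print(recall_at_review_fraction(scores, is_error, 0.5))
```

Sweeping `fraction` from small to large values traces out how much manual checking a given detection rate would cost for one variable, which is the trade-off the abstract summarizes for place and initial rhythm.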

