Abstract

Introduction
The rapid adoption of electronic health record (EHR) systems has resulted in extensive archives of data relevant to clinical research, hospital operations, and the development of learning health systems. However, EHR data are rarely available in a form that is cleaned, standardized, validated, and ready for use by stakeholders. We describe an in‐progress effort to overcome these challenges with cooperative, systematic data extraction and validation.

Methods
A multi‐disciplinary team of investigators collaborated to create the Complete Inpatient Record Using Comprehensive Electronic Data (CIRCE) Project dataset, which captures EHR data from six hospitals within the University of Pennsylvania Health System. Analysts and clinical researchers jointly and iteratively reviewed SQL queries and their output to validate desired data elements. Data from patients aged ≥18 years with at least one encounter at an acute care hospital or hospice occurring on or after July 1, 2017 were included. The CIRCE Project includes three layers: (1) raw data consisting of direct SQL query output, (2) cleaned data with errors removed, and (3) transformed data with standardized implementations of commonly used case definitions and clinical scores.

Results
Between July 1, 2017 and December 31, 2023, the dataset captured 1 629 920 encounters from 740 035 patients. Most encounters were emergency department‐only visits (n = 965 834, 59.3%), followed by inpatient admissions without an intensive care unit admission (n = 518 367, 23.7%). The median age was 46.9 years (25th–75th percentiles = 31.1–64.7) at the time of the first encounter. Most patients were female (n = 418 303, 56.5%), a substantial proportion were of non‐White race (n = 272 018, 36.8%), and 54 625 (7.4%) were of Hispanic/Latino ethnicity.

Conclusions
The CIRCE Project represents a novel cooperative research model to capture clinically validated EHR data from a large, diverse academic health system in the greater Philadelphia region and is designed to facilitate collaboration and data sharing to support learning health system activities. Ultimately, these data will be de‐identified and converted to a publicly available resource.

