Abstract

This study describes a methodology for developing data quality metrics in multi-institutional databases and for deriving a cumulative data quality score, the Aggregate Data Quality (ADQ) score. The European Society of Thoracic Surgeons (ESTS) database was used to create and apply the metrics, and the Units contributing to it were ranked by the quality of the data they uploaded, as measured by the ADQ.

We analysed data from 96 Units, each contributing at least 100 major lung resections (January 2007 to December 2014). The Units were anonymized by assigning each a random numeric code. The following metrics were developed to measure the data quality of each Unit: (i) record Completeness (COM): the rate of present values over 16 expected variables for all the records uploaded {[1 - ('null values'/total expected values for the Unit)] × 100; the concept of 'null value' was defined for each variable}; (ii) record Reliability (REL): the rate of consistent checks over 9 checks tested for all the records uploaded {[1 - (valid controls/total possible controls for the Unit)] × 100; specific reliability control queries were defined}. These two metrics were rescaled using the mean and standard deviation of the entire dataset and summed, giving: (iii) the ADQ score [COM rescaled + REL rescaled], which measures the cumulative data quality of a given dataset. The ADQ was used to rank the contributors.

The COM of ESTS database contributors ranged from 98.6% to 43% and the REL from 100% to 69%. Combining the rescaled metrics, the ADQ ranged from 2.67 (highest data quality) to -7.85 (lowest data quality). When the ranking based on COM alone was compared with the ranking based on the ADQ, 93% of Units changed position. The largest change was a drop of 66 positions in the ADQ ranking.

We described a reproducible method for data quality assessment in clinical multi-institutional databases. The ADQ is a single indicator that describes data quality and allows it to be compared among centres. It has the potential to objectively guide data quality management and improvement projects.
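The scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several points are assumptions made for illustration: "rescaled using the mean and standard deviation" is read as a z-score, the population standard deviation is used, REL is computed as the share of checks passed (following the prose definition of REL as the rate of consistent checks), and the unit codes and counts are invented rather than taken from the ESTS database.

```python
from statistics import mean, pstdev


def completeness(null_values, expected_values):
    """COM (%): share of expected values that are not null."""
    return (1 - null_values / expected_values) * 100


def reliability(failed_checks, total_checks):
    """REL (%): share of consistency checks passed (assumed reading)."""
    return (1 - failed_checks / total_checks) * 100


def rescale(values):
    """Z-score each value (assumption: population standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot rescale: all values are identical")
    return [(v - mu) / sigma for v in values]


def adq_scores(com_by_unit, rel_by_unit):
    """ADQ per unit: rescaled COM plus rescaled REL."""
    units = list(com_by_unit)
    com_z = rescale([com_by_unit[u] for u in units])
    rel_z = rescale([rel_by_unit[u] for u in units])
    return {u: c + r for u, c, r in zip(units, com_z, rel_z)}


def rank(scores):
    """Units ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical units with 100 records each:
# 16 expected variables -> 1600 expected values, 9 checks -> 900 checks.
com = {
    "U1": completeness(32, 1600),
    "U2": completeness(160, 1600),
    "U3": completeness(240, 1600),
}
rel = {
    "U1": reliability(0, 900),
    "U2": reliability(270, 900),
    "U3": reliability(9, 900),
}

adq = adq_scores(com, rel)
rank_by_com = rank(com)
rank_by_adq = rank(adq)

print("Ranking by COM:", rank_by_com)
print("Ranking by ADQ:", rank_by_adq)
```

In this invented data, U2 ranks above U3 on completeness alone but falls below it once reliability is included, which is the kind of position change the abstract reports when moving from the COM ranking to the ADQ ranking.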
