Data quality and statistics: Perfect together?

Thomas Redman, Roger Hoerl

doi:10.1080/08982112.2022.2103432

Abstract

At first blush, it appears that statistics and data quality should be perfect together. After all, statistical practitioners depend on high-quality data to conduct their analyses, and many data quality efforts seem well-suited to the application of statistical methods. Yet, the facts on the ground suggest otherwise: statistical applications, quality improvement projects, and data science initiatives are all plagued by bad data (Kenett and Redman 2019). Worse still, these respective communities have too often viewed data quality as uninteresting “grunt work,” and shown little interest in systematic improvement. But why? This article explores this quandary in detail, diagnoses the root causes, and shows that resolving these causes presents an enormous opportunity.

Full Text