Accounting for Data Errors Discovered from an Audit in Multiple Linear Regression

Bryan E Shepherd, Chang Yu

doi:10.1111/j.1541-0420.2010.01543.x

Abstract

A data coordinating team performed onsite audits and discovered discrepancies between the data sent to the coordinating center and that recorded at sites. We present statistical methods for incorporating audit results into analyses. This can be thought of as a measurement error problem, where the distribution of errors is a mixture with a point mass at 0. If the error rate is nonzero, then even if the mean of the discrepancy between the reported and correct values of a predictor is 0, naive estimates of the association between two continuous variables will be biased. We consider scenarios where there are (1) errors in the predictor, (2) errors in the outcome, and (3) possibly correlated errors in the predictor and outcome. We show how to incorporate the error rate and magnitude, estimated from a random subset (the audited records), to compute unbiased estimates of association and proper confidence intervals. We then extend these results to multiple linear regression where multiple covariates may be incorrect in the database and the rate and magnitude of the errors may depend on study site. We study the finite sample properties of our estimators using simulations, discuss some practical considerations, and illustrate our methods with data from 2815 HIV-infected patients in Latin America, of whom 234 had their data audited using a sequential auditing plan.
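
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator: it is a minimal method-of-moments correction for a single predictor. It assumes the errors are independent of the true predictor and of the outcome noise, and that the audited records are a simple random sample. All names (`simulate`, `naive_slope`, `corrected_slope`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(n=50_000, n_audit=10_000, p_err=0.2, sd_err=2.0,
             beta0=1.0, beta1=0.5, seed=0):
    """Simulate a database with a mixture error in the predictor.

    Assumed setup (illustrative only): each record is wrong with
    probability p_err; when wrong, a mean-zero normal error is added,
    so the error distribution has a point mass at 0.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)                        # true predictor
    y = beta0 + beta1 * x + rng.normal(0.0, 1.0, n)    # outcome, no error
    has_err = rng.random(n) < p_err                    # which records are wrong
    w = x + has_err * rng.normal(0.0, sd_err, n)       # reported predictor
    audit_idx = rng.choice(n, size=n_audit, replace=False)  # random audit
    return x, y, w, audit_idx


def naive_slope(w, y):
    """Least-squares slope of y on the reported (error-prone) predictor."""
    return np.cov(w, y)[0, 1] / np.var(w, ddof=1)


def corrected_slope(w, y, w_audit, x_audit):
    """Moment-corrected slope using discrepancies seen in the audit.

    Under the stated independence assumptions, cov(W, Y) = cov(X, Y)
    and var(W) = var(X) + var(D), where D = W - X, so subtracting the
    audit-based estimate of var(D) removes the attenuation.
    """
    d = w_audit - x_audit
    err_rate = np.mean(d != 0)      # estimated error rate
    err_var = np.var(d, ddof=1)     # combines error rate and magnitude
    slope = np.cov(w, y)[0, 1] / (np.var(w, ddof=1) - err_var)
    return slope, err_rate, err_var


if __name__ == "__main__":
    x, y, w, idx = simulate()
    slope, err_rate, err_var = corrected_slope(w, y, w[idx], x[idx])
    print("naive slope:    ", naive_slope(w, y))
    print("corrected slope:", slope)
    print("audit error rate:", err_rate, "error variance:", err_var)
```

In this setup the naive slope is pulled toward zero even though the errors average to 0, while the corrected slope is close to the value used to generate the data. Confidence intervals would also need to account for the uncertainty in the audit-based estimates, which this sketch does not attempt.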

Journal: Biometrics
Publication Date: Jan 31, 2011
Citations: 28