Abstract

We develop a statistical model for the testing of disease prevalence in a population. The model assumes a binary test result, positive or negative, but allows for biases in sample selection and both type I (false positive) and type II (false negative) testing errors. Our model also incorporates multiple test types and is able to distinguish between retesting and exclusion after testing. Our quantitative framework allows us to directly interpret testing results as a function of errors and biases. By applying our testing model to COVID-19 testing data and actual case data from specific jurisdictions, we are able to estimate, and quantify the uncertainty of, indices that are crucial in a pandemic, such as disease prevalence and fatality ratios. This article is part of the theme issue ‘Data science approach to infectious disease surveillance’.

Highlights

  • Real-time estimation of the level of infection in a population is important for assessing the severity of an epidemic as well as for guiding mitigation strategies. © 2021 The Authors.

  • We develop a statistical model for the testing of disease prevalence in a population.

  • Several previous studies have addressed the issue of correcting for errors and testing biases.

Introduction

Real-time estimation of the level of infection in a population is important for assessing the severity of an epidemic as well as for guiding mitigation strategies. Test results are mainly reported as binary values (0 or 1, negative or positive) and often do not include further information such as the cycle threshold (Ct) for RT-PCR tests. For serological COVID-19 tests, the estimated false-positive rate (FPR) and false-negative rate (FNR) are relatively low, with FPR ≈ 0.02−0.07 and FNR ≈ 0.02−0.16 [5,6,7,8]. Reported false-positive rates of RT-PCR tests are similar to those of serological tests, at about FPR = 0.05 [7]. Estimates of disease prevalence and other surveillance metrics [14,15] need to account for FPRs and FNRs, particularly when reported positive-testing rates [16] are in the few-percent range and therefore potentially dominated by type I errors.
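To illustrate why error rates matter when positive-testing rates are only a few percent, the following minimal sketch applies the standard Rogan-Gladen correction to an observed positive rate. This is a textbook estimator and not the model developed in this article: it assumes an unbiased sample and a single test type, whereas the article's model also covers selection bias and multiple test types. The function names and the example values are hypothetical and chosen only for illustration.

```python
def expected_positive_rate(prevalence: float, fpr: float, fnr: float) -> float:
    """Fraction of tests expected to be positive for a given true prevalence.

    Assumes unbiased sampling (an assumption of this sketch, not of the article).
    True positives: prevalence * (1 - fnr); false positives: (1 - prevalence) * fpr.
    """
    return prevalence * (1.0 - fnr) + (1.0 - prevalence) * fpr


def corrected_prevalence(positive_rate: float, fpr: float, fnr: float) -> float:
    """Rogan-Gladen estimate of prevalence from an observed positive rate.

    Inverts expected_positive_rate and clips the result to [0, 1], since
    sampling noise can push the raw estimate outside that interval.
    """
    denom = 1.0 - fpr - fnr
    if denom <= 0.0:
        raise ValueError("fpr + fnr must be below 1 for the test to be informative")
    estimate = (positive_rate - fpr) / denom
    return min(1.0, max(0.0, estimate))


if __name__ == "__main__":
    # Hypothetical values: 3% true prevalence, FPR = 0.05, FNR = 0.10.
    observed = expected_positive_rate(0.03, fpr=0.05, fnr=0.10)
    print(f"observed positive rate:  {observed:.4f}")
    print(f"corrected prevalence:    {corrected_prevalence(observed, 0.05, 0.10):.4f}")
```

With these hypothetical values, the observed positive rate is 0.0755, of which 0.0485 (roughly two thirds) comes from false positives, so the uncorrected rate would overstate the prevalence by more than a factor of two.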

Related work
Statistical testing model
Inference of prevalence and application to COVID-19 data
Inference of bias b
Findings
Summary and conclusion
