Abstract

This article addresses misclassification in a single categorical variable, that is, how to test whether collected categorical data are misclassified. To tackle this issue, a pair of null and alternative hypotheses is proposed, and a mixed Bayesian approach is taken to test them. Specifically, a bias-adjusted cell proportion estimator is presented that accounts for the bias caused by classification errors in the observed categorical data, and the chi-square test is adjusted accordingly. To test the null hypothesis that the data are not misclassified under a specified multinomial distribution against the alternative hypothesis that they are misclassified, the Bayes factor is calculated for the observed data and compared with the classical p-value.
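The abstract does not give the form of the bias-adjusted estimator or of the adjusted chi-square statistic. As a rough illustration of the general idea only, the sketch below uses the standard matrix-inversion correction, which is an assumption here and not necessarily the authors' method. The function names, the matrix `misclass` (with `misclass[i][j]` taken to be the assumed-known probability that a unit truly in category i is recorded as category j), and the example counts are all hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("misclassification matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def expected_observed_probs(true_probs, misclass):
    """Recorded-category probabilities implied by true_probs:
    observed_j = sum_i true_i * misclass[i][j]."""
    k = len(true_probs)
    return [sum(true_probs[i] * misclass[i][j] for i in range(k))
            for j in range(k)]


def adjust_proportions(counts, misclass):
    """Bias-adjusted cell proportions, obtained by inverting the relation
    used in expected_observed_probs (assumes misclass is known)."""
    total = sum(counts)
    observed = [c / total for c in counts]
    k = len(counts)
    transposed = [[misclass[i][j] for i in range(k)] for j in range(k)]
    return solve_linear(transposed, observed)


def chi_square_stat(counts, probs):
    """Pearson goodness-of-fit statistic for counts against probs."""
    total = sum(counts)
    return sum((c - total * p) ** 2 / (total * p)
               for c, p in zip(counts, probs))


# Hypothetical example: two categories, 100 observations.
counts = [60, 40]
misclass = [[0.9, 0.1],   # truly category 1: recorded as 1 or 2
            [0.2, 0.8]]   # truly category 2: recorded as 1 or 2
null_probs = [0.5, 0.5]

adjusted = adjust_proportions(counts, misclass)
naive_stat = chi_square_stat(counts, null_probs)
adjusted_stat = chi_square_stat(
    counts, expected_observed_probs(null_probs, misclass))
```

In this sketch the adjusted statistic compares the recorded counts with the counts the null distribution would produce after passing through the assumed error matrix, so sampling error enters through the usual chi-square reference distribution. Nothing here covers the Bayes factor calculation, whose prior specification is not stated in the abstract.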

Highlights

  • The problem of misclassification is a major issue in observational epidemiologic studies

  • To test the null hypothesis that the data are not misclassified under a specified multinomial distribution against the alternative hypothesis that they are misclassified, the Bayes factor is calculated for the observed data and compared with the classical p-value

  • The Bayes factor was found to exist for the treated group under scenario I but not under scenario II, whereas for the control group it exists under both scenarios



Introduction

Misclassification is a major issue in observational epidemiologic studies. Not long after Bross (1954) pointed out that non-differential misclassification would bias the corrected odds ratio toward the null hypothesis, Diamond and Lilienfeld (1962a-b) extended the result to various types of epidemiologic studies. With the exception of Mote and Anderson (1965), almost no authors have investigated the effect of misclassification in the analysis of a single categorical variable. Mote and Anderson primarily take a deductive approach to account for the bias caused by classification errors. The shortcoming of a deductive approach is that it does not take sampling error into consideration. The issue of how to deal with misclassification in the analysis of categorical data therefore remains unsolved.

