Abstract

The kappa statistic is frequently used as a measure of agreement among two or more raters. Although considerable research on statistical inferences for this statistic has been published for the case of two raters and a binary outcome, relatively little work has appeared on inference problems for the case of multiple raters and/or polytomous nominal outcome categories. In this paper we propose a new procedure for constructing inferences for the kappa statistic that may be applied to this general case. The procedure is based on a chi-square goodness-of-fit test as applied to the Dirichlet multinomial model, and is a natural extension of previously proposed procedures that apply to more restricted cases. A simulation study shows that the new procedure provides confidence interval coverage levels and type I error rates close to nominal over a wide range of parameter combinations. We also present a sample size formula which may be used to determine the required number of subjects and raters for a given number of outcome categories.
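The proposed procedure itself is not reproduced in the abstract. As background only, here is a minimal sketch of the multi-rater (Fleiss-type) kappa statistic referred to above, assuming an equal number of raters per subject; the function name multirater_kappa, the counts layout, and the example data are illustrative choices, not taken from the paper.

# Illustrative sketch (not the paper's proposed inference procedure):
# multi-rater kappa for N subjects, n raters per subject, and k nominal
# categories. counts[i][j] is the number of raters who assigned subject i
# to category j.

def multirater_kappa(counts):
    N = len(counts)        # number of subjects
    n = sum(counts[0])     # raters per subject (assumed constant)
    k = len(counts[0])     # number of outcome categories

    # Marginal proportion of all assignments falling in each category.
    p = [sum(row[j] for row in counts) / (N * n) for j in range(k)]

    # Observed agreement: average pairwise agreement within each subject.
    P_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts
    ) / N

    # Chance-expected agreement under independent ratings.
    P_e = sum(pj * pj for pj in p)

    return (P_bar - P_e) / (1 - P_e)


# Example: 4 subjects, 3 raters each, 3 outcome categories.
ratings = [
    [3, 0, 0],
    [2, 1, 0],
    [0, 3, 0],
    [1, 1, 1],
]
print(multirater_kappa(ratings))  # about 0.27

The paper's contribution lies in the inference step built on top of such an estimate (a chi-square goodness-of-fit test under the Dirichlet-multinomial model), which this sketch does not attempt to reproduce.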
