Abstract
The kappa coefficient (κ) was developed by Cohen (1960) for the purpose of assessing chance-corrected interrater agreement on nominal data. In its simplest form, κ is defined as follows: κ = (Po − Pc)/(1 − Pc), where Po is the observed proportion of agreement and Pc is the chance or expected proportion of agreement. Over the past three decades, κ has been extended to allow assessment among K > 2 raters (e.g., Conger, 1980; Uebersax, 1982), to allow missing observations (e.g., Conger, 1980; Fleiss, 1971; Uebersax, 1982), to allow nonmutually exclusive response categories (i.e., subjects can be classified into more than one category; e.g., Fleiss, Spitzer, Endicott, & Cohen, 1972; Kraemer, 1980), and to allow for partial disagreement (Cohen, 1968). The sampling distribution of κ has also been identified, allowing variance estimates to be calculated and significance tests to be conducted (Fleiss, Cohen, & Everitt, 1969). Not surprisingly, several computer programs have been offered as efficient means of providing one or more of these κ features (e.g., Bloor, 1983; Burns & Cavallaro, 1982; Chan, 1987; Powers, 1985; Watkins & Larimer, 1980). The present program provides a more comprehensive package by combining the features found in these earlier programs with additional ones. The program calculates κ and weighted κ for any number of raters (subject to the memory constraints of the machine being used). When the number of raters is greater than 2, all pairwise κ values are also provided. Missing data in the matrix of observations are handled in the calculation of multiple-rater κ by using Uebersax's (1982) generalized approach. This approach is based on the calculation of weighted averages of pairwise estimates for observed agreement (Po) and chance agreement (Pc), each weighted by the number of subjects each pair of judges has in common. For weighted κ, the user supplies a weight matrix that indicates the degree of partial credit to be given for different types of disagreement. The careful construction of the weight matrix also allows the calculation of κ when subjects can be classified into more than one category. This arises, for example, in the case of medical or psychiatric diagnosis, where more than one condition may be present in the same subject. Relative ordering of these multiple diagnoses is also allowed. Thus, when the order of primary and secondary diagnoses is critical (A-B ≠ B-A), different orders must be considered separate categories, with a weight assigned to reflect their partial disagreement.
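The calculations described in the abstract can be illustrated with a minimal Python sketch. It is not the program's own code, and it should not be read as Uebersax's (1982) exact formulas. The function names (`pair_agreement`, `kappa`, `multi_rater_kappa`), the use of `None` for a missing observation, the dictionary form of the weight matrix, and the choice to compute each pair's marginal proportions only over the subjects that both raters classified are all assumptions made for illustration.

```python
from itertools import combinations


def pair_agreement(a, b, categories, weights=None):
    """Return (n, Po, Pc) for two raters, using only subjects both rated.

    a, b       : sequences of category labels; None marks a missing rating
                 (an assumption of this sketch).
    categories : list of all possible category labels.
    weights    : dict mapping (label_a, label_b) -> credit in [0, 1].
                 If omitted, only exact agreement earns credit (unweighted kappa).
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        return 0, 0.0, 0.0
    if weights is None:
        weights = {(i, j): 1.0 if i == j else 0.0
                   for i in categories for j in categories}
    # Observed agreement: mean credit over the shared subjects.
    po = sum(weights[(x, y)] for x, y in pairs) / n
    # Marginal proportions of each rater over the shared subjects.
    pa = {c: sum(1 for x, _ in pairs if x == c) / n for c in categories}
    pb = {c: sum(1 for _, y in pairs if y == c) / n for c in categories}
    # Chance agreement: expected credit if the two raters were independent.
    pc = sum(weights[(i, j)] * pa[i] * pb[j]
             for i in categories for j in categories)
    return n, po, pc


def kappa(po, pc):
    """kappa = (Po - Pc) / (1 - Pc); undefined (NaN) when Pc equals 1."""
    return float("nan") if pc == 1 else (po - pc) / (1 - pc)


def multi_rater_kappa(ratings, categories, weights=None):
    """Average pairwise Po and Pc, each weighted by the number of subjects
    the pair of raters has in common, then form a single kappa."""
    total = po_sum = pc_sum = 0.0
    for a, b in combinations(ratings, 2):
        n, po, pc = pair_agreement(a, b, categories, weights)
        total += n
        po_sum += n * po
        pc_sum += n * pc
    if total == 0:
        return float("nan")
    return kappa(po_sum / total, pc_sum / total)


if __name__ == "__main__":
    # Hypothetical data: three raters, four subjects, two missing ratings.
    raters = [["x", "x", "y", "y"],
              ["x", "y", "y", "y"],
              ["x", None, "y", None]]
    _, po, pc = pair_agreement(raters[0], raters[1], ["x", "y"])
    print("pairwise kappa (raters 1 and 2):", kappa(po, pc))
    print("multiple-rater kappa:", multi_rater_kappa(raters, ["x", "y"]))

    # Weighted kappa on ordered categories with partial credit for near misses.
    cats = [1, 2, 3]
    w = {(i, j): 1 - abs(i - j) / 2 for i in cats for j in cats}
    _, po_w, pc_w = pair_agreement([1, 2, 3, 1], [1, 3, 3, 2], cats, w)
    print("weighted kappa:", kappa(po_w, pc_w))
```

Under the same assumptions, ordered multiple diagnoses could be handled by treating each ordered combination (for example, hypothetical labels "A-B" and "B-A") as its own category and giving the pair a weight between 0 and 1 in the weight dictionary.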
More From: Behavior Research Methods, Instruments, & Computers