Abstract

Cohen's kappa (κ) is often recommended for nominal data as a measure of inter-rater (inter-coder) agreement or reliability. In this paper we ask which term is appropriate in genre analysis, what statistical measures are valid to measure it, and how much the choice of units affects the values obtained. We find that although both agreement and reliability may be of interest, only agreement can be measured with nominal data. Moreover, while kappa may be appropriate for macrostructure or corpus analysis, it is inappropriate for move or component analysis, due to the requirement of κ that the units be predetermined, fixed, and independent. κ further assumes that all disagreements in category assignment are equally likely, which may not be true. We also describe other measures, including correlation, chi-square, and percent agreement, and demonstrate that despite its limitations, percent agreement is the only valid measure in many situations. Finally, we demonstrate why the choice of unit has a large effect on the value calculated. These findings also apply to other studies in applied linguistics using nominal data. We conclude that the methodology used needs to be clearly explained to ensure that the requirements have been met, as with any other statistical test.
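To make the contrast between the two statistics concrete, the sketch below computes both percent (raw) agreement and Cohen's kappa for two raters coding the same units. The move labels (`M1`–`M3`) and the rating vectors are hypothetical illustrations, not data from the paper; the kappa value is taken from scikit-learn's `cohen_kappa_score`, which implements the standard chance-corrected formula κ = (p_o − p_e)/(1 − p_e).

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical move codes assigned by two raters to the same ten units
rater_a = ["M1", "M1", "M2", "M3", "M2", "M1", "M3", "M2", "M1", "M2"]
rater_b = ["M1", "M2", "M2", "M3", "M2", "M1", "M3", "M1", "M1", "M2"]

# Percent (raw) agreement: proportion of units on which the raters match
percent_agreement = np.mean([a == b for a, b in zip(rater_a, rater_b)])

# Cohen's kappa: observed agreement corrected for chance agreement,
# kappa = (p_o - p_e) / (1 - p_e)
kappa = cohen_kappa_score(rater_a, rater_b)

print(f"percent agreement = {percent_agreement:.2f}")
print(f"Cohen's kappa     = {kappa:.2f}")
```

Note that both statistics presuppose that the units being coded are the same for both raters and fixed in advance; as the abstract argues, this assumption fails when the raters themselves segment the text into moves, which is why percent agreement over an agreed unitisation is often the only defensible figure.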
