Abstract

Introduction

Cohen’s kappa is often used to quantify the agreement between two pathologists. Nevertheless, a high prevalence of the feature of interest can lead to seemingly paradoxical results, such as low Cohen’s kappa values despite high “observed agreement.” Here, we investigate Cohen’s kappa using data from histologic subtyping assessment of lung adenocarcinomas and introduce alternative measures that can overcome this “kappa paradox.”

Methods

A total of 50 frozen sections from stage I lung adenocarcinomas less than or equal to 3 cm in size were independently reviewed by two pathologists to determine the absence or presence of five histologic patterns (lepidic, papillary, acinar, micropapillary, solid). For each pattern, observed agreement (proportion of cases with concordant “absent” or “present” ratings) and Cohen’s kappa were calculated, along with Gwet’s AC1.

Results

The prevalence of any amount of the histologic patterns ranged from 42% (solid) to 97% (acinar). On the basis of Cohen’s kappa, there was substantial agreement for four of the five patterns (lepidic, 0.65; papillary, 0.67; micropapillary, 0.64; solid, 0.61). Acinar had the lowest Cohen’s kappa (0.43, moderate agreement), despite having the highest observed agreement (88%). In contrast, Gwet’s AC1 values were close to or higher than Cohen’s kappa across patterns (lepidic, 0.64; papillary, 0.69; micropapillary, 0.71; solid, 0.73; acinar, 0.85). The proportion of positive versus negative agreement was 93% versus 50% for acinar.

Conclusions

Given the dependence of Cohen’s kappa on feature prevalence, interrater agreement studies should include complementary indices such as Gwet’s AC1 and proportions of specific agreement, especially in settings with a high prevalence of the feature of interest.
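The indices named above can all be computed from a single 2×2 table of paired ratings. The following is a minimal sketch using the standard two-rater, two-category formulas. The function name `agreement_indices` and the cell counts in the usage example are hypothetical: the counts were chosen only so that the results round to the figures reported for the acinar pattern, and they are not the study's actual data.

```python
def agreement_indices(a, b, c, d):
    """Agreement indices for two raters and a binary (present/absent) rating.

    Hypothetical cell layout (an assumption of this sketch):
      a = both raters say "present"
      b = rater 1 "present", rater 2 "absent"
      c = rater 1 "absent",  rater 2 "present"
      d = both raters say "absent"
    """
    n = a + b + c + d
    po = (a + d) / n                      # observed agreement

    p1 = (a + b) / n                      # rater 1 proportion "present"
    p2 = (a + c) / n                      # rater 2 proportion "present"

    # Cohen's kappa: chance agreement from the product of the marginals
    pe_kappa = p1 * p2 + (1 - p1) * (1 - p2)
    kappa = (po - pe_kappa) / (1 - pe_kappa)

    # Gwet's AC1: chance agreement from the average "present" proportion
    pi = (p1 + p2) / 2
    pe_ac1 = 2 * pi * (1 - pi)
    ac1 = (po - pe_ac1) / (1 - pe_ac1)

    # Proportions of specific (positive and negative) agreement
    ppos = 2 * a / (2 * a + b + c)
    pneg = 2 * d / (2 * d + b + c)

    return {"po": po, "kappa": kappa, "ac1": ac1, "ppos": ppos, "pneg": pneg}


# Illustrative counts only (n = 50); NOT the study's data.
result = agreement_indices(a=41, b=3, c=3, d=3)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts, observed agreement is 0.88 while Cohen's kappa is about 0.43 and Gwet's AC1 about 0.85, with positive and negative agreement of about 0.93 and 0.50. Because "present" ratings dominate, the chance-agreement term in Cohen's kappa is large (about 0.79), which pulls kappa down even though the raters agree on most cases. The sketch does not guard against degenerate tables in which a denominator is zero.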

