To determine the reliability of the National Institutes of Health (NIH)-modified semiquantitative histologic scoring system for lupus nephritis. Cross-sectional study, repeated after 8 to 9 months. Four community hospitals and one university medical center.
Five pathologists, all experienced in reading renal biopsy specimens, assessed 25 specimens that had been obtained from patients with a clinical diagnosis of systemic lupus erythematosus and showed diffuse proliferative glomerulonephritis. Biopsy specimens were scored independently and blindly by pathologists for components of nephritis chronicity and activity. Reliability was measured by percentage agreement, intraclass correlation coefficient or kappa statistic, and individual reader effect on the group arithmetic mean.
As scored by the readers, the mean chronicity index score varied from 2.3 to 4.8 on a 12-point scale (P = 0.001) and the mean activity index score varied from 5.8 to 11.4 on a 24-point scale (P = 0.0001). Pairs of readers gave scores within 1 point for the chronicity index and within 2 points for the activity index in 50% of cases, and risk group assignments based on chronicity index (three strata) and activity index (two strata) were concordant in 59% and 76% of cases, respectively. Intraclass correlation coefficients for inter-reader agreement were 0.58 for the chronicity index (P < 0.01) and 0.52 for the activity index (P < 0.01).
Intrareader agreement was uniformly higher than inter-reader agreement, but mean intraclass correlation coefficients exceeded 0.70 for only 1 of the 10 index components. Repeated readings yielded chronicity index scores that were more than 1 point discordant in 45% of cases and activity index scores that were more than 2 points discordant in 43% of cases. Risk group assignment changed on the basis of chronicity index and activity index in 36% and 21% of cases, respectively.
In a nonreferral setting, the NIH-modified scoring system for lupus nephritis is only moderately reproducible and, if used to prognosticate renal outcome, may result in erroneous predictions of risk for renal failure and response to therapy.
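
Two of the reliability measures named above, agreement within a fixed number of points between pairs of readers and the intraclass correlation coefficient, can be illustrated with a minimal Python sketch. The function names and the one-way random-effects form of the coefficient are assumptions made for illustration; the abstract does not state which intraclass correlation model was used, and any scores passed to these functions would be hypothetical rather than study data.

```python
from itertools import combinations


def pairwise_agreement(scores, tolerance):
    """Fraction of reader-pair comparisons that differ by at most `tolerance`.

    `scores` is a list of rows, one row per specimen and one column per reader.
    """
    hits = 0
    total = 0
    for row in scores:
        for a, b in combinations(row, 2):
            hits += abs(a - b) <= tolerance
            total += 1
    return hits / total


def icc_oneway(scores):
    """One-way random-effects ICC (an assumed model, not taken from the source).

    Computed as (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB is the
    between-specimen mean square and MSW the within-specimen mean square.
    """
    n = len(scores)      # number of specimens
    k = len(scores[0])   # number of readers per specimen
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

Calling `pairwise_agreement(table, 1)` on a specimens-by-readers table of chronicity index scores would correspond to the "within 1 point" criterion, and `pairwise_agreement(table, 2)` to the "within 2 points" criterion used for the activity index.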