To assess the reliability of lumbar facet arthropathy evaluation with computed tomography (CT) or magnetic resonance imaging (MRI) in patients with and without lumbar disc prosthesis and to estimate the reliability for individual CT and MRI findings indicating facet arthropathy.

Metal artifact-reducing CT and MRI protocols were performed at follow-up of 114 chronic back pain patients treated with (n = 66) or without (n = 48) lumbar disc prosthesis. Three experienced radiologists independently rated facet joint space narrowing, osteophyte/hypertrophy, erosions, subchondral cysts, and total grade facet arthropathy at each of the three lower lumbar levels on both CT and MRI, using the rating system of Weishaupt et al. CT and MRI examinations were randomly mixed and rated independently. Findings were dichotomized before analysis. Overall kappa and (due to low prevalence) prevalence- and bias-adjusted kappa were calculated to assess interobserver agreement.

Interobserver agreement on total grade facet arthropathy was moderate at all levels with CT (kappa 0.47-0.48) and poor to fair with MRI (kappa 0.20-0.32). Mean prevalence- and bias-adjusted kappa was lower for osteophyte/hypertrophy than for the other individual findings (CT 0.58 versus 0.79-0.86, MRI 0.35 versus 0.81-0.90), higher with CT than with MRI when rating osteophyte/hypertrophy (0.58 versus 0.35) and total grade facet arthropathy (0.54 versus 0.31), and generally similar at levels with and without disc prosthesis.

Interobserver agreement on facet arthropathy was moderate with CT and better with CT than with MRI. Disc prosthesis did not influence agreement. A more reliable grading of facet arthropathy requires a more consistent evaluation of osteophytes/hypertrophy.

• In this study, interobserver agreement on facet arthropathy (FA) severity, based on facet joint space narrowing, osteophyte/hypertrophy, erosions, and subchondral cysts, was better with CT than with MRI.
• Metal artifact-reducing CT and MRI protocols helped to improve visibility and maintain agreement when evaluating severity of FA at levels with metallic disc prosthesis.
• Agreement was poorer for severity of osteophytes/hypertrophy than for the other evaluated FA findings; improved agreement on total grade FA evaluated with CT or MRI thus requires more consistent grading of osteophytes/hypertrophy between different radiologists.
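
The distinction between kappa and prevalence- and bias-adjusted kappa (PABAK) for dichotomized ratings can be illustrated with a minimal Python sketch. It is not the study's analysis code: the ratings below are invented, the function names are hypothetical, and the mean of pairwise Cohen's kappa is used only as a simple stand-in for the "overall kappa" across three raters, whose exact definition the abstract does not give.

```python
from itertools import combinations


def observed_agreement(a, b):
    """Proportion of joints on which two raters give the same 0/1 rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa for two raters with binary (0/1) ratings."""
    po = observed_agreement(a, b)
    pa = sum(a) / len(a)  # share of positive ratings, rater A
    pb = sum(b) / len(b)  # share of positive ratings, rater B
    pe = pa * pb + (1 - pa) * (1 - pb)  # agreement expected by chance
    if pe == 1.0:
        return float("nan")  # undefined when both raters never vary
    return (po - pe) / (1 - pe)


def pabak(a, b):
    """Prevalence- and bias-adjusted kappa for binary ratings: 2 * po - 1."""
    return 2 * observed_agreement(a, b) - 1


def mean_pairwise(metric, raters):
    """Average a two-rater metric over all rater pairs (assumed summary)."""
    pairs = list(combinations(raters, 2))
    return sum(metric(a, b) for a, b in pairs) / len(pairs)


# Invented example: 20 facet joints, a finding rated present (1) only rarely.
rater1 = [1, 0] + [0] * 18
rater2 = [1, 1] + [0] * 18
rater3 = [0, 1] + [0] * 18
raters = [rater1, rater2, rater3]

mean_kappa = mean_pairwise(cohen_kappa, raters)
mean_pabak = mean_pairwise(pabak, raters)
print(f"mean pairwise kappa: {mean_kappa:.2f}")
print(f"mean pairwise PABAK: {mean_pabak:.2f}")
```

In this invented example the raters agree on at least 18 of 20 joints in every pair, yet the ordinary kappa is much lower than PABAK because the finding is rare. This is the low-prevalence effect that motivates reporting both statistics.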