This study assessed the reliability and reproducibility of the AO/OTA, Frykman and Fernandez classification systems for distal radius fractures on CT. Four radiologists (one radiology resident, two musculoskeletal radiology fellows and one radiology consultant) independently evaluated CT scans of 115 patients with distal radius fractures and classified the fractures according to the AO/OTA, Frykman and Fernandez classification systems. To assess reproducibility, two observers performed a second set of readings after an interval of six weeks. Interobserver reliability was calculated for each classification system using the intraclass correlation coefficient (ICC) and Light's modification of kappa. Intraobserver agreement was calculated using Cohen's kappa. Interobserver reliability using ICC showed fair agreement for the AO/OTA (0.447) and Frykman (0.432) classification systems and poor agreement for the Fernandez classification system (0.196). Interobserver agreement using kappa was moderate for AO/OTA classification into one of the three types (0.447), but only slight for complete classification into type, group and subgroup (0.177). Interobserver agreement using kappa was slight for the Fernandez classification system (0.196) and moderate for the Frykman classification system (0.406). Intraobserver agreement for the AO/OTA classification system was moderate for observer 1 (0.449) and slight for observer 2 (0.162). Intraobserver agreement for the Frykman classification system was substantial for observer 1 (0.754) and moderate for observer 2 (0.496). Intraobserver agreement for the Fernandez classification system was moderate for both observers (0.333, 0.320). Currently, no classification system is fully reproducible. The AO/OTA and Frykman classification systems performed better than the Fernandez classification system in terms of interobserver reliability. However, the Frykman classification system performed better than both the AO/OTA and Fernandez classification systems in terms of intraobserver reproducibility. The Fernandez classification system had the worst inter- and intraobserver reliability in the present study. The reliability and reproducibility of the AO/OTA classification system decreased when fractures were divided into subgroups.
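For readers unfamiliar with the agreement statistics named above, the following is a minimal Python sketch of Cohen's kappa for two sets of readings, and of Light's kappa taken as the mean of Cohen's kappa over all pairs of observers. It is an illustration under stated assumptions, not the analysis code used in the study: the function names `cohen_kappa` and `light_kappa` are hypothetical, the kappa is unweighted, and the example labels are invented toy data.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of cases given the same label by both readers.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the two readers' marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def light_kappa(ratings):
    """Light's kappa: mean of Cohen's kappa over every pair of observers."""
    pairs = list(combinations(ratings, 2))
    if not pairs:
        raise ValueError("need at least two observers")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Invented toy data: three observers assigning AO/OTA types to four fractures.
observer_1 = ["A", "A", "B", "B"]
observer_2 = ["A", "B", "B", "B"]
observer_3 = ["A", "A", "B", "B"]

pairwise = cohen_kappa(observer_1, observer_2)
overall = light_kappa([observer_1, observer_2, observer_3])
```

In this sketch, intraobserver agreement would be obtained by passing the same observer's first and second readings to `cohen_kappa`, while interobserver agreement would come from `light_kappa` applied to all observers' first readings.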