Abstract

Background

Diagnostic images are often assessed for clinical outcomes using subjective methods, which are limited by the skill of the reviewer. Computer-aided diagnosis (CAD) algorithms that assist reviewers in their decisions concerning outcomes have been developed to increase sensitivity and specificity in the clinical setting. However, these systems have not been well utilized in research settings to improve the measurement of clinical endpoints. Reductions in bias through their use could have important implications for etiologic research.

Methods

Using the example of cortical cataract detection, we developed an algorithm for assisting a reviewer in evaluating digital images for the presence and severity of lesions. Available image processing and statistical methods that were easily implementable were used as the basis for the CAD algorithm. The performance of the system was compared to the subjective assessment of five reviewers using 60 simulated images. Cortical cataract severity scores from 0 to 16 were assigned to the images by the reviewers and the CAD system, with each image assessed twice to obtain a measure of variability. Image characteristics that affected reviewer bias were also assessed by systematically varying the appearance of the simulated images.

Results

The algorithm yielded severity scores with smaller bias on images where cataract severity was mild to moderate (approximately ≤ 6/16ths). On high severity images, the bias of the CAD system exceeded that of the reviewers. The variability of the CAD system was zero on repeated images but ranged from 0.48 to 1.22 for the reviewers. The direction and magnitude of the bias exhibited by the reviewers was a function of the number of cataract opacities, the shape of the lesions, and the contrast of the lesions in the simulated images.

Conclusion

CAD systems are feasible to implement with available software and can be valuable when medical images contain exposure or outcome information for etiologic research. Our results indicate that such systems have the potential to decrease bias and discriminate very small changes in disease severity. Simulated images are a tool that can be used to assess performance of a CAD system when a gold standard is not available.
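The evaluation described above compares graders on two quantities: bias (mean signed error of assigned severity against the known severity of each simulated image) and test–retest variability (agreement between two gradings of the same images). A minimal sketch of that comparison, with illustrative scores that are not the study's data, might look like this:

```python
# Hypothetical sketch of the bias/variability comparison described above.
# All severity values below are illustrative, not the study's data.

def bias(scores, truth):
    """Mean signed error of assigned severities vs. the known severities."""
    return sum(s - t for s, t in zip(scores, truth)) / len(scores)

def repeat_variability(first_pass, second_pass):
    """Mean absolute difference between two gradings of the same images."""
    return sum(abs(a - b) for a, b in zip(first_pass, second_pass)) / len(first_pass)

truth      = [2, 4, 6, 8, 10]   # simulated "true" severities (0-16 scale)
reviewer_1 = [3, 5, 7, 7, 9]    # a reviewer's first grading pass
reviewer_2 = [2, 5, 6, 8, 8]    # the same reviewer's second pass
cad        = [2, 4, 6, 7, 8]    # deterministic CAD output, identical both passes

print(bias(reviewer_1, truth))              # reviewer's mean signed error
print(bias(cad, truth))                     # CAD underestimates at high severity
print(repeat_variability(reviewer_1, reviewer_2))
print(repeat_variability(cad, cad))         # 0.0 -- CAD output is deterministic
```

A deterministic algorithm reproduces its output exactly on repeated images, which is why the CAD system's variability is zero while the reviewers' ranged from 0.48 to 1.22.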

Highlights

  • Diagnostic images are often assessed for clinical outcomes using subjective methods, which are limited by the skill of the reviewer

  • The absolute difference between the mean severity given to each image and the true severity indicated that the computer-aided diagnosis (CAD) system outperformed the reviewers when the opacity was relatively mild (Figure 3)

  • The bias associated with the CAD system increased as the severity increased, and the bias was consistently in the direction of underestimating the image severity



Introduction

Diagnostic images are often assessed for clinical outcomes using subjective methods, which are limited by the skill of the reviewer. The use of computer-aided diagnosis (CAD) systems to improve the sensitivity and specificity of lesion detection has become a focus of medical imaging and diagnostic radiology research [3]. Such systems have been explored extensively as a method for improving the detection of breast cancers from mammography [4,5], and the evidence indicates CAD can improve the accuracy of detection [6]. Evaluation of such systems can be challenging, since the quality of the images, the application, and the expertise of the user all contribute to detection performance. Established methods such as receiver operating characteristic (ROC) analysis and free-response receiver operating characteristic (FROC) analysis can provide metrics for assessing performance given knowledge of the true disease classification. Biopsy can provide a gold standard (true tumor presence) for cancer diagnostics, but simple gold standards for other image diagnostics, or for outcomes other than presence of disease (e.g. disease progression), may be challenging to find.
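ROC analysis summarizes a detector's performance as the area under the curve (AUC), which equals the probability that a randomly chosen diseased case receives a higher score than a randomly chosen non-diseased case. A minimal sketch of that computation, using the Mann–Whitney rank formulation with made-up scores and labels, might look like this:

```python
# Illustrative sketch of ROC AUC via the Mann-Whitney U statistic.
# Scores and labels below are made up for demonstration.

def roc_auc(scores, labels):
    """AUC = P(score of a diseased case > score of a non-diseased case),
    counting ties as half-wins."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]                 # 1 = disease present, 0 = absent
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]     # detector confidence per image

print(roc_auc(scores, labels))              # one misordered pair out of nine
```

Note that this requires knowing the true disease classification for every case, which is exactly the gold-standard requirement the paragraph above describes as hard to satisfy outside cancer diagnostics.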

