Abstract

This issue includes an article (1) on sample size estimation for receiver operating characteristic (ROC) studies. ROC studies are widely used to assess medical imaging systems and several such studies have appeared in this journal. Sample size estimation is an important consideration when planning an ROC study. It is well known that underpowered studies represent a waste of resources, besides subjecting patients to unnecessary procedures in a study of doubtful scientific value. Funding agencies and institutional review boards usually require the applicant to show that a contemplated study has adequate power. However, in spite of these exhortations, it is probably true that the majority of published ROC studies are underpowered. The problem is not unique to our field, because underpowering is but one reason why a study cannot be replicated (2). As academic radiologists, we take pride in being especially receptive to methodology issues that affect our discipline. In a typical ROC study, a representative sample of radiologist observers interpret a representative sample of cases in the two modalities being compared. A figure of merit (eg, area under the ROC curve) is calculated for each observer–modality combination. For each modality, the figures of merit are averaged over all readers, and the difference between the two modality averages is calculated; this is the observed effect size. The statistical analysis yields a P value, which indicates how readily chance alone could have accounted for the observed effect size. If the P value is less than 5%, then the study has "succeeded" and the new modality may become clinically accepted. However, what does one do if the observed effect size is not significant? Assuming that the observed effect size is "encouraging," the next step is to estimate the numbers of readers and cases for a new study in order to have a reasonable chance of obtaining a significant finding. Sample size estimation amounts essentially to estimating the variability of the observed effect size.
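The figure-of-merit calculation described above can be illustrated with a minimal sketch. It assumes the empirical (Wilcoxon) estimate of the area under the ROC curve as the figure of merit, which is only one possible choice, and it uses synthetic normally distributed ratings in place of real reader data. The function names `empirical_auc` and `observed_effect_size`, the array layout, and the reader and case counts are all hypothetical choices made for illustration.

```python
import numpy as np


def empirical_auc(nondiseased, diseased):
    """Empirical (Wilcoxon) AUC: the fraction of (non-diseased, diseased)
    case pairs in which the diseased case has the higher rating, with
    ties counted as one half."""
    x = np.asarray(nondiseased, dtype=float)[:, None]
    y = np.asarray(diseased, dtype=float)[None, :]
    return float(np.mean((y > x) + 0.5 * (y == x)))


def observed_effect_size(ratings, truth):
    """ratings: array of shape (2 modalities, readers, cases).
    truth: array of shape (cases,), 1 = diseased, 0 = non-diseased.
    Returns the table of figures of merit and the difference between
    the reader-averaged figures of merit (modality 2 minus modality 1)."""
    ratings = np.asarray(ratings, dtype=float)
    truth = np.asarray(truth).astype(bool)
    n_mod, n_rdr, _ = ratings.shape
    fom = np.empty((n_mod, n_rdr))
    for i in range(n_mod):
        for j in range(n_rdr):
            fom[i, j] = empirical_auc(ratings[i, j, ~truth],
                                      ratings[i, j, truth])
    reader_avg = fom.mean(axis=1)  # one average per modality
    return fom, float(reader_avg[1] - reader_avg[0])


# Synthetic data (an assumption, not real study data): 2 modalities,
# 6 readers, 200 cases, half of them diseased.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], [100, 100])
separation = np.array([1.0, 1.3])  # assumed signal strength per modality
ratings = rng.normal(size=(2, 6, 200)) + separation[:, None, None] * truth

fom, effect = observed_effect_size(ratings, truth)
print(fom.shape)  # one figure of merit per reader-modality combination
print(effect)     # observed effect size
```

The sketch produces only the observed effect size; it does not estimate its variability, which is the quantity that sample size estimation requires.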
The challenge is that when one calculates a figure of merit, as in the area under the ROC curve, there is a large data reduction. For example, a study with six readers, 200 cases and two modalities yields only 12 numbers, the figures of merit of each of the six readers in the two modalities. The Dorfman, Berbaum, and Metz (DBM) approach (3) depends on the construct of the jackknife pseudovalue. Now, instead of only 12 numbers, one has 12 × 200 (ie, 2400) pseudovalues to analyze, a statistician’s dream! However,
