Abstract

We want to thank Drs Welsh and Henry for their interest in our work and for their interesting observations. We are delighted that our review and previous correspondence [1] have achieved their main objective: to start a discussion about the reproducibility of ultrasound, an important issue that is usually overlooked, or even avoided, by both researchers and clinicians. The main message of this ‘Correspondence’ from Drs Welsh and Henry is that we should incorporate known physiological variation in the interpretation of the intraclass correlation coefficient (ICC). Although we agree that this is a valid argument, we believe that the overall variability, considering all sources of error, is more important for clinicians; additionally, considering too many sources of input will make the interpretation of results much more difficult. Although this topic deserves more attention, we will use this opportunity to discuss other points that we believe to be more relevant. One important concern of ours regards which statistical estimates we should use when examining reproducibility. Frequently, either the ICC or the concordance correlation coefficient (CCC) is used in reproducibility studies to examine the reliability of a measurement. Roughly, the reliability is represented by the proportion of the total variability that can be attributed to the ‘true’ variability between individuals, i.e. the ability of a measurement to differentiate between the examined subjects [2]. Although these estimates are useful, they are difficult to understand properly. For example, in their Correspondence, Drs Welsh and Henry have demonstrated that they do not completely comprehend the meaning of ICC, despite having published several articles using this estimate [3-5]: they argued that if physiological variation is in the order of ± 10%, then the ICC would be restricted to a maximum of 0.90.
Although such logic is tempting, it is wrong: as mentioned earlier, the ICC depends on the proportion of the variability that can be attributed to the ‘true’ variation between individuals, while the other sources of variability might be attributed to both physiological variation and random error during measurements. To demonstrate why these authors were misled, we estimated the ICC assuming a ‘physiological variation’ of ± 10% and no random error from measurements in three different scenarios: in the first scenario, the maximum ‘true’ value is only 50% higher than the minimum value (e.g. ranging from 1.0 to 1.5); in the second scenario, the maximum ‘true’ value is 100% higher than the minimum value (e.g. ranging from 1.0 to 2.0); in the last scenario, the maximum ‘true’ value is 900% higher than the minimum value (e.g. ranging from 1.0 to 10.0). In the first scenario, the resulting ICC would be approximately 0.90; however, in the second scenario, the resulting ICC would be 0.95, while in the last scenario it would be 0.99. Because ICC and CCC are complex to understand, we fear they are not very useful and might even cause some confusion for most researchers and clinicians. In our opinion, reproducibility should be interpreted mainly on the basis of the limits of agreement (LoA) of the relative difference between measurements. This concept is straightforward and can be further simplified to facilitate its interpretation. Actually, when Drs Welsh and Henry stated that physiological variation might be in the order of ± 10%, they were using a simplified concept of the LoA of the relative differences. 
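The three scenarios above can be explored with a short simulation. The sketch below is illustrative only and is not necessarily the calculation behind the figures quoted above. It assumes that the ‘true’ values are spread uniformly over each range, that ± 10% is the 95% LoA of the relative difference between two measurements (so each measurement carries normal multiplicative noise), and that the ICC is the one-way random-effects form. The function names (icc_oneway, loa_relative, simulate) are hypothetical.

```python
import math
import random
import statistics


def icc_oneway(pairs):
    """One-way random-effects ICC for two measurements per subject."""
    n, k = len(pairs), 2
    means = [(a + b) / k for a, b in pairs]
    grand_mean = statistics.fmean(means)
    ms_between = k * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for (a, b), m in zip(pairs, means)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def loa_relative(pairs):
    """95% limits of agreement of the relative difference (a - b) / mean."""
    rel = [(a - b) / ((a + b) / 2) for a, b in pairs]
    centre = statistics.fmean(rel)
    half_width = 1.96 * statistics.stdev(rel)
    return centre - half_width, centre + half_width


def simulate(low, high, loa=0.10, n_subjects=5000, seed=1):
    # Assumption: 'true' values are spread uniformly between low and high.
    # Assumption: +/- loa is the 95% LoA of the relative difference between
    # two measurements, so each measurement gets normal multiplicative noise
    # with SD = loa / (1.96 * sqrt(2)).
    rng = random.Random(seed)
    sd = loa / (1.96 * math.sqrt(2))
    pairs = []
    for _ in range(n_subjects):
        true_value = rng.uniform(low, high)
        pairs.append((true_value * (1 + rng.gauss(0, sd)),
                      true_value * (1 + rng.gauss(0, sd))))
    return icc_oneway(pairs), loa_relative(pairs)


# The three scenarios described in the text.
scenarios = [(1.0, 1.5), (1.0, 2.0), (1.0, 10.0)]
results = [simulate(low, high) for low, high in scenarios]
for (low, high), (icc, (lower, upper)) in zip(scenarios, results):
    print(f"true range {low}-{high}: ICC = {icc:.2f}, "
          f"LoA = {lower:+.1%} to {upper:+.1%}")
```

Under these assumptions, the LoA of the relative difference stay close to ± 10% in all three scenarios, while the ICC rises as the spread of ‘true’ values widens. The exact ICC values depend on the assumed distributions.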
By using such a simplified version, perhaps it will become evident that the interpretation of reproducibility as being good, moderate or poor is not needed, since the concept is very easy to apply directly in clinical practice: we could report our measurements with the LoA, which could be interpreted as the margin of error of the exam, or even incorporate the expected random errors in the cut-off values to be used for clinical decisions. All of these points must be discussed extensively among people immersed in ultrasound research/practice and we should aim to reach a consensus regarding how to conduct and interpret reproducibility studies. We believe that, so far, this has not been done properly; however, there is now an ongoing initiative focused on this issue: the ‘True Reproducibility of UltraSound Techniques’ (TRUST) (http://trust-statement.org/).
W. P. Martins*† and C. O. Nastri†‡
†Department of Obstetrics and Gynecology, Ribeirao Preto Medical School, University of Sao Paulo (DGO-FMRP-USP), Ribeirao Preto, Brazil; ‡School of Health Technology – Ultrasonography School of Ribeirao Preto (FATESA-EURP), Ribeirao Preto, Brazil
*Correspondence. (Av. Bandeirantes, 3900–8 andar - HCRP - Campus Universitario, Ribeirao Preto, Sao Paulo, Brazil 14048-900 e-mail: [email protected])
