Abstract

In this article, we consider the issue of reproducibility within the field of cognitive hearing science. First, we examine how retest reliability can provide useful information for the generality of results and intervention effectiveness. Second, we provide an overview of retest reliability coefficients within three areas of cognitive hearing science (cognition, speech perception, and self-reported measures of communication) and show how the reporting of these coefficients differs between fields. We argue that practices surrounding the provision of retest coefficients are currently most rigorous in clinical assessment and that basic science research would benefit from adopting similar standards. Finally, based on a distinction between direct replications (which aim to keep materials as close to the original study as possible) and conceptual replications (which test the same purported mechanism using different materials), we discuss new initiatives which address the need for both. Using the example of the auditory Stroop task, we provide practical illustrations of how these theoretical issues can be addressed within the context of a multi-lab replication study. By illustrating how theoretical concepts can be put into practice in empirical research, we hope to encourage others to set up and participate in a wide variety of reproducibility-related studies.

Highlights

  • Reproducibility is a core requirement for the accrual of scientific knowledge and the advancement of a field

  • Following the framework of Hendrick (1991) and Schmidt (2009), we suggest that the core package of the proposed multi-lab study keeps the immaterial realization of the primary information focus constant while varying its material realization – in other words, it provides a conceptual replication by testing the robustness and replicability of results across different Stroop tasks

  • We have suggested that one way to improve reproducibility, when assessing individual differences, is to encourage researchers to include retest reliability measures of their quantitative assessment methods as a routine aspect of testing, analysis, and reporting

INTRODUCTION

Reproducibility is a core requirement for the accrual of scientific knowledge and the advancement of a field. Two levels of replication exist: individual-level replications, which concern the reproducibility of individual differences, often in the form of retest reliability; and group-level replications, which concern the reproducibility of effect sizes, often expressed as differences between group means. The former refers to the similarity of an individual's test scores at different points in time when no intervention has been applied. A simple correlation between sessions indexes only the consistency of scores; the intraclass correlation coefficient (ICC), on the other hand, explicitly estimates both systematic and random error, thereby allowing for a distinction between estimates of agreement and consistency (Aldridge et al., 2017). In this context, agreement refers to the extent to which observed raw scores obtained by a measurement tool for one individual match between raters or time-points in the absence of any actual (systematic) change in the outcome being measured. Regardless of the strategy used to control for error, calculated values for retest reliability are typically judged according to the guidance given by Cicchetti and Sparrow (1981): retest reliability below 0.40 is poor, between 0.40 and 0.59 is fair, between 0.60 and 0.74 is good, and 0.75 and above is excellent.
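
To make the distinction between agreement and consistency concrete, the short sketch below (our illustration, not material from the study itself) computes single-measure ICCs for a two-session test-retest design from the standard two-way ANOVA mean squares and attaches the Cicchetti and Sparrow (1981) labels. The function names and the simulated listener scores are assumptions chosen purely for illustration.

import numpy as np

def icc_single(scores):
    """Single-measure ICCs for an n-subjects x k-sessions score matrix.

    Returns (icc_agreement, icc_consistency) from the two-way ANOVA
    decomposition: agreement penalises systematic shifts between
    sessions, consistency does not.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)   # one mean per subject
    col_means = x.mean(axis=0)   # one mean per session

    # Mean squares for subjects (rows), sessions (columns), and residual error
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = x - row_means[:, None] - col_means[None, :] + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))

    icc_consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    icc_agreement = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + (k / n) * (ms_cols - ms_err)
    )
    return icc_agreement, icc_consistency

def cicchetti_label(icc):
    """Verbal label following Cicchetti and Sparrow (1981)."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"

# Illustrative test-retest data: 6 listeners measured in 2 sessions,
# with a small systematic improvement (practice effect) at retest.
scores = [[52, 57], [61, 65], [48, 55], [70, 73], [58, 64], [66, 69]]
icc_a, icc_c = icc_single(scores)
print(f"agreement   ICC = {icc_a:.2f} ({cicchetti_label(icc_a)})")
print(f"consistency ICC = {icc_c:.2f} ({cicchetti_label(icc_c)})")

Because the simulated scores include a small systematic practice effect at retest, the agreement ICC comes out lower than the consistency ICC, which is precisely the distinction drawn above; an index of consistency alone would be blind to that shift.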

