Abstract

When designing experimental studies with human participants, experimenters must decide how many trials each participant will complete, as well as how many participants to test. Most discussion of statistical power (the ability of a study design to detect an effect) has focused on sample size, and assumed sufficient trials. Here we explore the influence of both factors on statistical power, represented as a 2-dimensional plot on which iso-power contours can be visualized. We demonstrate the conditions under which the number of trials is particularly important, that is, when the within-participant variance is large relative to the between-participants variance. We then derive power contour plots using existing data sets for 8 experimental paradigms and methodologies (including reaction times, sensory thresholds, fMRI, MEG, and EEG), and provide example code to calculate estimates of the within- and between-participants variance for each method. In all cases, the within-participant variance was larger than the between-participants variance, meaning that the number of trials has a meaningful influence on statistical power in commonly used paradigms. An online tool is provided (https://shiny.york.ac.uk/powercontours/) for generating power contours, from which the optimal combination of trials and participants can be calculated when designing future studies.

Highlights

  • Statistical power is the ability of a study design with a given sample size to detect an effect of a particular magnitude

  • We present the rationale for incorporating the number of measurements into calculations of statistical power in experimental studies of psychology and human neuroscience

  • Power contour plots can be generated by subsampling existing data sets or using an online tool, and permit researchers to make informed choices about how many participants to test, and how long to test each one for, at the study design stage


Introduction

Statistical power is the ability of a study design with a given sample size to detect an effect of a particular magnitude. Low-powered studies are less able to detect a true effect (and so make more Type II errors) than high-powered studies, and any real effects that are detected are likely to have inflated effect sizes (Colquhoun, 2014; Ioannidis, 2008). These problems are common across many scientific disciplines: estimates of power across studies in the neurosciences (Button et al., 2013) yield values in the range 8%–30%, far below the desired level of ≥ 80%. However, there is a second degree of freedom available to many experimenters at the study design stage: the number of repetitions (or trials) of a given experimental condition completed by each participant.
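The dependence of power on both factors can be sketched analytically. For a one-sample (or paired) design in which each of n participants contributes k trials, the standard deviation of a participant's mean is approximately sqrt(σ_b² + σ_w²/k), where σ_b and σ_w are the between- and within-participant standard deviations. The following is a minimal sketch, not the authors' published code; the function name and the example values of μ, σ_b, and σ_w are illustrative assumptions chosen so that within-participant variance dominates, as reported in the abstract.

```python
import numpy as np
from scipy import stats

def power_one_sample(n, k, mu, sigma_b, sigma_w, alpha=0.05):
    """Approximate power of a two-sided one-sample t-test on participant
    means, where each of n participants contributes k trials.

    Illustrative sketch: the effective SD of a participant mean is
    sqrt(sigma_b^2 + sigma_w^2 / k), so adding trials shrinks only the
    within-participant component of the noise.
    """
    sd_eff = np.sqrt(sigma_b**2 + sigma_w**2 / k)
    ncp = mu / sd_eff * np.sqrt(n)        # noncentrality parameter
    df = n - 1
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # Power = P(|t| > t_crit) under the noncentral t distribution
    return (1 - stats.nct.cdf(t_crit, df, ncp)
            + stats.nct.cdf(-t_crit, df, ncp))

# Assumed example values: effect mu = 10, sigma_b = 20, sigma_w = 100
# (within-participant variance much larger than between-participants).
grid = [(n, k, power_one_sample(n, k, mu=10, sigma_b=20, sigma_w=100))
        for n in (10, 20, 40) for k in (10, 40, 160)]
```

Evaluating this function over a grid of (n, k) combinations and joining points of equal power gives the iso-power contours described above; when σ_w dominates σ_b, increasing k moves a design across contours almost as effectively as increasing n.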

