Abstract

Biomedical research has come to rely on p-values as a deterministic measure for data-driven decision-making. In the widely used null hypothesis significance testing (NHST) framework for identifying statistically significant differences among groups of observations, a single p-value is computed from sample data. It is then routinely compared with a threshold, commonly set to 0.05, to assess the evidence against the null hypothesis, that is, the hypothesis that there are no differences among groups. Because the estimated p-value tends to decrease as the sample size increases, applying this methodology to datasets with large sample sizes tends to result in the rejection of the null hypothesis, which makes the test uninformative in this situation. We propose a new approach to detect differences based on the dependence of the p-value on the sample size. We introduce new descriptive parameters that overcome the effect of sample size on the interpretation of the p-value for datasets with large sample sizes, reducing the uncertainty in deciding whether biological differences exist between the compared experiments. The methodology enables graphical and quantitative characterization of the differences between the compared experiments, guiding researchers in the decision process. An in-depth study of the methodology is carried out on simulated and experimental data. Code is available at https://github.com/BIIG-UC3M/pMoSS.

Highlights

  • Biomedical research has come to rely on p-values as a deterministic measure for data-driven decision-making

  • The authors provide a detailed description of the drawbacks of null hypothesis significance testing (NHST) applied to large datasets and suggest the use of confidence intervals (CI) and effect sizes as alternative measures

  • We describe the unpublished dataset that corresponds to the first real application example in the reported results

Summary

Introduction

Biomedical research has come to rely on p-values as a deterministic measure for data-driven decision-making. In the widely used null hypothesis significance testing (NHST) framework for identifying statistically significant differences among groups of observations, a single p-value is computed from sample data. It is routinely compared with a threshold, commonly set to 0.05, to assess the evidence against the null hypothesis, that is, the hypothesis that there are no differences among groups. The ability to acquire, store and disseminate large amounts of data is constantly improving in life-science laboratories. The availability of such large datasets for many kinds of analysis supports the proliferation of new methodologies for their study. When large sample sizes are available, life scientists can obtain statistically significant evidence against the null hypothesis, in the form of a small enough p-value, even when the differences are of no practical interest. To the best of our knowledge, no existing method exploits the sample-size dependence of the p-value to derive interpretable parameters for assessing whether differences are of interest from a biological or clinical perspective.
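
The sample-size dependence of the p-value can be illustrated with a minimal simulation. The sketch below is not the authors' method or the pMoSS code. It assumes two normally distributed groups whose means differ by a hypothetical 0.1 standard deviations, and it uses a normal approximation to a two-sample test of means, chosen only to keep the example free of external dependencies. The function names `two_sample_p_value` and `mean_p_value` are illustrative.

```python
import math
import random
import statistics


def two_sample_p_value(x, y):
    """Two-sided p-value for a difference in means.

    Assumption: normal approximation to a Welch-type statistic,
    used here for illustration only.
    """
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    z = (statistics.fmean(x) - statistics.fmean(y)) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def mean_p_value(n, shift, repeats, rng):
    """Average p-value over repeated draws of two groups of size n."""
    total = 0.0
    for _ in range(repeats):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(shift, 1.0) for _ in range(n)]  # small, fixed true difference
        total += two_sample_p_value(x, y)
    return total / repeats


rng = random.Random(0)  # fixed seed so the run is reproducible
sizes = [10, 100, 1000, 10000]
curve = {n: mean_p_value(n, shift=0.1, repeats=50, rng=rng) for n in sizes}

for n, p in curve.items():
    print(f"n = {n:>6}  mean p-value = {p:.4f}")
```

Under these assumptions the underlying difference is the same at every sample size, yet the average p-value at the largest n falls below the 0.05 threshold while the one at the smallest n does not. This is the behaviour that motivates describing the p-value as a function of sample size rather than reading a single value.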
