Abstract

Testing for differences between two groups is among the most frequently performed statistical procedures in empirical research. The traditional frequentist approach relies on null hypothesis significance tests, which use p values to reject a null hypothesis. Recently, a substantial body of research has proposed Bayesian versions of the most common parametric and nonparametric frequentist two-sample tests, including Student’s two-sample t-test and its nonparametric counterpart, the Mann–Whitney U test. In this paper, the assumptions and models underlying recently proposed Bayesian two-sample tests, and their implications for practical research, are explored and contrasted with the frequentist solutions. An extensive simulation study demonstrates that the proposed Bayesian tests achieve better type I error control at slightly increased type II error rates. These results are important because balancing type I and type II errors is a crucial goal in many areas of research, and shifting towards the Bayesian two-sample tests while simultaneously increasing the sample size yields smaller type I error rates. Moreover, the results highlight that the differences in type II error rates between frequentist and Bayesian two-sample tests depend on the magnitude of the underlying effect.

Highlights

  • In a lot of quantitative research like the medical and social sciences, two-sample tests like Student’s t-test are among the most widely carried out statistical procedures

  • This paper reviewed some recently developed Bayesian parametric and nonparametric two-sample tests as possible alternatives to null hypothesis significance tests which are usually applied

  • The traditional frequentist solutions make use of null hypothesis significance testing, which suffers from several well-known problems

Summary

Introduction

In much quantitative research, such as in the medical and social sciences, two-sample tests like Student’s t-test are among the most widely used statistical procedures. Null hypothesis significance tests which employ p values are prone to inflated false-positive error rates if the distributional assumptions are violated (Rochon et al. 2012), if optional stopping rules are applied (Kruschke and Liddell 2018b; Berger and Wolpert 1988), or if the study conducted is underpowered (McElreath and Smaldino 2015). To mitigate these problems, considerable research has been carried out in the last decade on developing Bayesian counterparts to popular frequentist two-sample tests like Student’s t-test and the Mann–Whitney U test (van Doorn et al. 2020; Gönen et al. 2005; Wetzels et al. 2009; Wang and Liu 2016; Gronau et al. 2019). Owing to this progress, Bayesian versions of two-sample tests have become more popular in the recent literature, and in this paper, these are reviewed and contrasted with their frequentist counterparts.
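To illustrate the claim that optional stopping inflates false-positive rates, the following is a minimal simulation sketch and not the simulation study of the paper. It assumes a simplified two-sample z-test with known unit variance (instead of Student’s t-test), two groups drawn from the same normal distribution, and five hypothetical interim looks. The function names, the look schedule and the number of simulations are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def z_test_p_value(x, y, sigma=1.0):
    """Two-sided p value of a two-sample z-test with known common sigma.

    Assumption: sigma is known, which simplifies Student's t-test so that
    only the Python standard library is needed.
    """
    n_x, n_y = len(x), len(y)
    standard_error = sigma * (1.0 / n_x + 1.0 / n_y) ** 0.5
    z = (sum(x) / n_x - sum(y) / n_y) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate_type_i_error(n_sims=2000, looks=(20, 40, 60, 80, 100),
                          alpha=0.05, seed=1):
    """Estimate type I error rates under a true null hypothesis.

    Returns (fixed_rate, optional_rate): the rejection rate when testing
    once at the final sample size, and when testing at every look and
    stopping at the first significant result.
    """
    rng = random.Random(seed)
    n_max = max(looks)
    fixed_rejections = 0
    optional_rejections = 0
    for _ in range(n_sims):
        # Both groups come from N(0, 1), so every rejection is a false positive.
        x = [rng.gauss(0.0, 1.0) for _ in range(n_max)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_max)]
        # Fixed design: a single test at the planned per-group sample size.
        if z_test_p_value(x, y) < alpha:
            fixed_rejections += 1
        # Optional stopping: reject if any interim look is significant.
        if any(z_test_p_value(x[:n], y[:n]) < alpha for n in looks):
            optional_rejections += 1
    return fixed_rejections / n_sims, optional_rejections / n_sims


fixed_rate, optional_rate = simulate_type_i_error()
print(f"fixed-n type I error rate:           {fixed_rate:.3f}")
print(f"optional-stopping type I error rate: {optional_rate:.3f}")
```

Under these assumptions, the fixed-sample rate should stay close to the nominal 5% level, while the optional-stopping rate should be clearly larger, because each additional look offers another chance of a spurious rejection.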

Frequentist two-sample tests
Student’s two-sample t-test
Bayesian two-sample tests
A parametric Bayesian two-sample t-test
Episodic memory performance and lateral eye movement
Frequentist analysis
Bayesian analysis
Discussion
Simulation study
Type I error rates
Conclusion
Compliance with ethical standards
