Equivalence Testing and the Second Generation P-Value

Daniël Lakens, Marie Delacre

doi:10.15626/mp.2018.933

Abstract

To move beyond the limitations of null-hypothesis tests, statistical approaches have been developed where the observed data are compared against a range of values that are equivalent to the absence of a meaningful effect. Specifying a range of values around zero allows researchers to statistically reject the presence of effects large enough to matter, and prevents practically insignificant effects from being interpreted as a statistically significant difference. We compare the behavior of the recently proposed second generation p-value (SGPV; Blume, D’Agostino McGowan, Dupont, & Greevy, 2018) with the more established Two One-Sided Tests (TOST) equivalence testing procedure (Schuirmann, 1987). We show that the two approaches yield almost identical results under optimal conditions. Under suboptimal conditions (e.g., when the confidence interval is wider than the equivalence range, or when confidence intervals are asymmetric) the second generation p-value becomes difficult to interpret. The second generation p-value is interpretable in a dichotomous manner (i.e., when the SGPV equals 0 or 1 because the confidence interval lies completely within or outside of the equivalence range), but this dichotomous interpretation does not require calculations. We conclude that equivalence tests yield more consistent p-values, distinguish between datasets that yield the same second generation p-value, and allow for easier control of Type I and Type II error rates.

Highlights

To move beyond the limitations of null-hypothesis tests, statistical approaches have been developed where the observed data are compared against a range of values that are equivalent to the absence of a meaningful effect.
Because one-sided tests are performed, one can conclude equivalence by checking whether the 1 - 2 × α confidence interval falls completely within the equivalence bounds. Because both equivalence tests and second generation p-values (SGPVs) are based on whether and how much a confidence interval overlaps with the equivalence bounds, it seems worthwhile to compare the behavior of the newly proposed SGPV with that of equivalence tests, to examine the unique contribution of the SGPV to the statistical toolbox.
When we discuss the relationship between the p-values from Two One-Sided Tests (TOST) and the SGPV, we focus on their correspondence at three values, namely where the TOST p = 0.025 and SGPV = 1, where the TOST p = 0.5 and SGPV = 0.5, and where the TOST p = 0.975 and SGPV = 0.
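The three correspondence points can be illustrated with a minimal sketch. It is not the authors' code: it assumes a z-test with a known standard error (a normal approximation rather than the t-distribution), a 95% confidence interval, and hypothetical equivalence bounds of -2 and 2. The function names are illustrative. The SGPV is computed here as the plain fraction of the confidence interval that falls inside the equivalence range, which should be adequate only while the interval is no more than twice as wide as that range.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for a z-test approximation


def tost_p_value(mean, se, low, high):
    """Return the TOST p-value: the larger of the two one-sided p-values."""
    p_lower = 1.0 - STD_NORMAL.cdf((mean - low) / se)  # tests H0: effect <= low
    p_upper = STD_NORMAL.cdf((mean - high) / se)       # tests H0: effect >= high
    return max(p_lower, p_upper)


def confidence_interval(mean, se, level=0.95):
    """Return a symmetric (lower, upper) confidence interval around the mean."""
    z = STD_NORMAL.inv_cdf(0.5 + level / 2.0)
    return mean - z * se, mean + z * se


def overlap_fraction(ci, low, high):
    """Fraction of the CI that lies inside the equivalence range [low, high]."""
    ci_low, ci_high = ci
    overlap = max(0.0, min(ci_high, high) - max(ci_low, low))
    return overlap / (ci_high - ci_low)


# Hypothetical setup: equivalence bounds of -2 and 2, standard error of 0.5.
LOW, HIGH, SE = -2.0, 2.0, 0.5
half_width = STD_NORMAL.inv_cdf(0.975) * SE  # half-width of the 95% CI

results = []
# Observed means chosen so that the CI sits just inside the upper bound,
# is centred on the upper bound, and sits just outside the upper bound.
for mean in (HIGH - half_width, HIGH, HIGH + half_width):
    p = tost_p_value(mean, SE, LOW, HIGH)
    frac = overlap_fraction(confidence_interval(mean, SE), LOW, HIGH)
    results.append((p, frac))
    print(f"mean = {mean:.2f}  TOST p = {p:.3f}  SGPV = {frac:.3f}")
```

Under these assumptions, moving the observed mean from just inside the upper bound to just outside it takes the TOST p-value from 0.025 through 0.5 to 0.975, while the overlap fraction falls from 1 through 0.5 to 0.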

Summary

SGPV as a uniform measure of overlap

It is clear that the SGPV and the p-value from TOST are closely related. When confidence intervals are symmetric, we can think of the SGPV as a straight line that is directly related to the p-value from an equivalence test at three values. An important issue when calculating the SGPV is its reliance on the “small sample correction”: the SGPV is set to 0.5 whenever the confidence interval is more than twice as wide as the equivalence range and the CI overlaps with both the upper and the lower equivalence bound. This exception to the normal calculation of the SGPV was introduced to prevent misleading values.
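The effect of the small sample correction can be sketched as follows. This is an illustration based on our reading of the definition in Blume et al. (2018), where the overlap is divided by twice the width of the equivalence range once the confidence interval is more than twice as wide as that range. The function name `sgpv` and the example intervals are hypothetical and not taken from the paper.

```python
def sgpv(ci_low, ci_high, eq_low, eq_high):
    """Second generation p-value for a CI and an equivalence range.

    Assumed definition: the proportion of the CI that overlaps the
    equivalence range, with the small sample correction applied when
    the CI is more than twice as wide as the equivalence range.
    """
    ci_width = ci_high - ci_low
    eq_width = eq_high - eq_low
    overlap = max(0.0, min(ci_high, eq_high) - max(ci_low, eq_low))
    if ci_width > 2.0 * eq_width:
        # Small sample correction: divide by twice the equivalence range,
        # which caps the SGPV at 0.5 for very wide intervals.
        return overlap / (2.0 * eq_width)
    return overlap / ci_width


# Hypothetical equivalence range of -1 to 1.
print(sgpv(-0.5, 0.5, -1.0, 1.0))    # CI entirely inside the range
print(sgpv(2.0, 3.0, -1.0, 1.0))     # CI entirely outside the range
print(sgpv(0.5, 1.5, -1.0, 1.0))     # half of a narrow CI overlaps
print(sgpv(-3.0, 3.0, -1.0, 1.0))    # wide CI covering both bounds
print(sgpv(-10.0, 10.0, -1.0, 1.0))  # much wider CI covering both bounds
```

In this sketch the last three intervals all return 0.5 even though they differ greatly in precision. That is one way to see why datasets with the same SGPV can still be distinguished by their TOST p-values.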

Journal: Meta-Psychology
Publication Date: Jul 13, 2020
Citations: 3
License type: CC BY 4.0

