Abstract

Pseudoreplication occurs when the number of measured values or data points exceeds the number of genuine replicates, and when the statistical analysis treats all data points as independent and thus fully contributing to the result. By artificially inflating the sample size, pseudoreplication contributes to irreproducibility, and it is a pervasive problem in biological research. In some fields, more than half of published experiments have pseudoreplication – making it one of the biggest threats to inferential validity. Researchers may be reluctant to use appropriate statistical methods if their hypothesis is about the pseudoreplicates and not the genuine replicates; for example, when an intervention is applied to pregnant female rodents (genuine replicates) but the hypothesis is about the effect on the multiple offspring (pseudoreplicates). We propose using a Bayesian predictive approach, which enables researchers to make valid inferences about biological entities of interest, even if they are pseudoreplicates, and show the benefits of this approach using two in vivo data sets.

Highlights

  • Pseudoreplication occurs when the number of measured values or data points exceeds the number of genuine replicates, and when the statistical analysis treats all data points as independent and fully contributing to the result

  • Statisticians and quantitative biologists have worried about pseudoreplication – known as the unit-of-anlysis problem[1]

  • The controversy may persist because formulating a scientific question as statistical model can be difficult if one has received little formal training in experimental design or statistics – like many laboratory-based biologists

Read more

Summary

Introduction

Pseudoreplication occurs when the number of measured values or data points exceeds the number of genuine replicates, and when the statistical analysis treats all data points as independent and fully contributing to the result. The two main ways of dealing with pseudoreplication are: (1) average the pseudoreplicates to obtain one value per genuine replicate, or (2) use a more sophisticated approach that captures the structure of the data where the pseudoreplicates are nested under the genuine replicates, such as a multilevel/hierarchical model[8,9,10,11]. These recommended methods have lead to a key point of contention. There is a renewed interest in prediction[16,17,18,19,20], perhaps due to the attention that machine learning is receiving

Objectives
Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.