Abstract

Direct replication studies follow an original experiment's methods as closely as possible. They provide information about the reliability and validity of an original study's findings. The present paper asks what comparative cognition should expect if its studies were directly replicated, and how researchers can use this information to improve the reliability of future research. Because published effect sizes are likely overestimated, comparative cognition researchers should not expect findings with p-values just below the significance level to replicate consistently. Nevertheless, several statistical and design features can help researchers identify reliable research. However, researchers should not simply aim for maximum replicability when planning studies; comparative cognition faces strong replicability-validity and replicability-resource trade-offs. Next, the paper argues that truly direct replication studies may not even be possible in comparative cognition because of: (1) a lack of access to the species of interest; (2) real differences in animal behavior across sites; and (3) sample size constraints that produce very uncertain statistical estimates, meaning that it will often not be possible to detect statistical differences between original and replication studies. These three reasons suggest that many claims in the comparative cognition literature are practically unfalsifiable, which presents a challenge for cumulative science in comparative cognition. To address this challenge, comparative cognition can begin to formally assess the replicability of its findings, improve its statistical thinking, and explore new infrastructures that help the field create and combine the data needed to understand how cognition evolves.

Highlights

  • Comparative cognition is a broad field that investigates how animals acquire, process and use information (Beran et al., 2014; Shettleworth, 2009).

  • Some areas of comparative cognition bear the hallmarks of science that has proven difficult to reproduce, namely small sample sizes, noisy measurements and unlikely hypotheses (Forstmeier et al., 2017), whereas other areas may be less affected by low replicability because they use research designs that typically show higher replicability, such as within-subjects designs with many trials per tested animal (Smith & Little, 2018).

  • While the paper focuses on what comparative cognition should expect from replication studies, and how it can use this knowledge to improve in the future, the arguments can also be applied retrospectively to assess the evidential value of previously published findings.


Summary

Section 1.1 – Simulation Study

Another way to view these ideas is through simulation, and to this end we simulated a very simple model of comparative cognition research (for details and code, see the Appendix; a minimal sketch appears below). These results arise in the absence of any research practices that inflate false positives (Fraser et al., 2018; John et al., 2012; Simmons et al., 2011); they are solely the consequence of publishing only research with p < .05 while performing at least some research with relatively low power. This simulation does not accurately characterize the field of comparative cognition: not all comparative cognition research is performed with 80%, 50%, 20% or 5% power, or uses two-sample t-tests comparing two groups of 10 animals. One scenario in which these conclusions might be inappropriate is research using design-and-test combinations in which the p-value distribution is not uniform or near uniform under the null hypothesis, such as a binomial test with a small number of observations.
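The authors' full simulation code is in the Appendix; what follows is only a minimal sketch of the same idea under assumptions of our own: unit-variance normal outcomes, 20,000 simulated studies per scenario, and true effect sizes of d = 1.32, 0.94, 0.56 and 0, chosen to give roughly 80%, 50%, 20% and 5% power for a two-sample t-test with two groups of 10. It applies a publish-only-if-p < .05 filter and reports how much the published effect sizes overestimate the true ones.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)

n_per_group = 10   # two groups of 10 animals, as in the example above
n_sims = 20_000    # simulated studies per scenario (an arbitrary choice)
alpha = 0.05       # publication filter: only p < .05 is published

# Hypothetical true effect sizes (Cohen's d), chosen so that a two-sample
# t-test with n = 10 per group has roughly the stated power; d = 0 is the null.
scenarios = {"80% power": 1.32, "50% power": 0.94,
             "20% power": 0.56, "5% power (null)": 0.0}

for label, d in scenarios.items():
    published = []
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_group)
        treatment = rng.normal(d, 1.0, n_per_group)
        if stats.ttest_ind(treatment, control).pvalue < alpha:
            # Record the magnitude of the published (significant) effect.
            pooled_sd = np.sqrt((control.var(ddof=1) + treatment.var(ddof=1)) / 2)
            published.append(abs(treatment.mean() - control.mean()) / pooled_sd)
    print(f"{label:>15}: true d = {d:.2f}, "
          f"mean published |d| = {np.mean(published):.2f}, "
          f"published {len(published) / n_sims:.0%} of studies")
```

The lower the power, the larger the gap between the true effect and the mean published effect, even though every simulated researcher behaved honestly. The closing caveat about non-uniform null p-value distributions can be illustrated the same way; with, say, 10 binary trials (our example, not the paper's), a binomial test can only produce a handful of distinct p-values:

```python
from scipy.stats import binomtest

# Two-sided binomial-test p-values for every possible outcome of 10 fair trials:
# the distribution under the null is discrete, not uniform, so the realized
# false-positive rate (here ~2.1%) sits well below the nominal 5%.
pvals = sorted({round(binomtest(k, n=10, p=0.5).pvalue, 4) for k in range(11)})
print(pvals)  # [0.002, 0.0215, 0.1094, 0.3438, 0.7539, 1.0]
```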
