Abstract

Reproducibility and replicability play a pivotal role in science. This article reflects on reproducibility and replicability as they figure in large-scale genome-wide association studies. Overall, we emphasize the importance of enhancing data reproducibility, analysis reproducibility, and result replicability. We make recommendations for developing study designs that 1) address batch effects and selection bias, 2) incorporate discrete discovery and replication phases, and 3) secure a large sample size. We emphasize the importance of systematic and transparent data generation, processing, and quality control pipelines, as well as a rigorous, field-specific, standardized analysis protocol. We offer guidance on collaborative frameworks, open-access analysis tools and software, and the use of supporting mandates, infrastructure, and repositories for data and resource sharing. Finally, we identify the role of incentives and culture in fueling the production of reproducible and replicable research through partnerships of researchers, funding agencies, and journals.

Highlights

  • The US National Academies of Sciences, Engineering, and Medicine published a comprehensive report on Reproducibility and Replicability in Science (National Academy of Sciences, 2019)

  • The article reflects on reproducibility and replicability as they figure in large-scale genome-wide association studies

  • We emphasize the importance of enhancing data reproducibility, analysis reproducibility, and result replicability

Summary

Introduction

The US National Academies of Sciences, Engineering, and Medicine published a comprehensive report on Reproducibility and Replicability in Science (National Academy of Sciences, 2019). The European Commission published a scoping report on Reproducibility of Scientific Results in the EU (European Commission, 2020). These two reports remind us how important reproducibility and replicability are for ensuring the validity of new scientific discoveries and for sustaining trust in science. With a view toward addressing this pressing problem, we share the lessons that we have learned about enhancing reproducibility and replicability in large-scale Genome-Wide Association Studies (GWAS), and we make a few recommendations as well. We hope that these lessons are useful for advancing reproducible and replicable science in emerging studies of whole genome sequencing and biobanks, as well as in other disciplines. We emphasize the importance of engaging the scientific community in collaboratively developing a culture centered around the practices of 1) validating and standardizing data generation, data processing, and protocol development; 2) testing and standardizing open-source analysis pipelines and software; 3) building and supporting infrastructure and repositories to allow for convenient and safe data and resource sharing; and 4) engaging researchers, funding agencies, and journals in collective efforts aimed at improving data and resource sharing, with a view toward the larger aim of promoting reproducible and replicable science.

Strategies for Enhancing Data Reproducibility
Roles of Study Design in Reproducible and Replicable Science
Enhancing Analysis Reproducibility and Result Replicability
Discussions and Recommendations