Abstract

The lack of reproducibility in animal phenotyping experiments is a growing concern among the biomedical community. One contributing factor is the inadequate description of statistical analysis methods, which prevents researchers from replicating results even when the original data are provided. Here we present PhenStat, a freely available R package that provides a variety of statistical methods for the identification of phenotypic associations. The methods have been developed for high-throughput phenotyping pipelines implemented across various experimental designs, with an emphasis on managing temporal variation. PhenStat targets two user groups: small-scale users who wish to interact with and test data from large resources, and large-scale users who require an automated statistical analysis pipeline. The software guides the user in selecting appropriate analysis methods based on the dataset and is designed to allow for additions and modifications as needed. The package was tested on mouse and rat data and is used by the International Mouse Phenotyping Consortium (IMPC). By providing raw data and the version of PhenStat used, resources like the IMPC give users the ability to replicate and explore results within their own computing environment.

Highlights

  • Irreproducibility of animal research is slowing advancement in understanding disease mechanisms, squandering resources on unproductive avenues of research and contributing to the cost of development of new drugs [1]

  • Applying the appropriate statistical analysis is a challenge in assessing biological data [28,29,30] and is an area of active research for high throughput phenotyping [10,20]

  • There is a need for accessible, freely available statistical tools that support the community in choosing the best analysis, especially when complex statistical methods are involved



Introduction

Irreproducibility of animal research is slowing advancement in understanding disease mechanisms, squandering resources on unproductive avenues of research and contributing to the cost of development of new drugs [1]. In large-scale model organism screens, a suite of statistical tests is required to accurately identify associations between genotype and phenotype. Because high-throughput methods ensure that large volumes of phenotype data continue to be collected, an automated process for selecting statistical methods, together with an analysis platform, is required. We have developed PhenStat, an R package of tools for the identification of phenotypic associations, with an emphasis on statistical tools for high-throughput experiments. It is freely available from the Bioconductor repository. The PhenStat package has been tested and demonstrated on 420 lines of mouse phenotyping data from the Sanger Mouse Genetics Project (http://www.sanger.ac.uk/mouseportal/) [15] and the EUMODIC project (http://www.eumodic.org/) [16], and on rat phenotyping datasets from the PhysGen resource (http://pga.mcw.edu) [17]. Using PhenStat enables these analyses to be automated and version controlled.

