Abstract

Genotype-phenotype studies aim to identify causative relationships between genes and phenotypes. The International Mouse Phenotyping Consortium (IMPC) is a high-throughput phenotyping program whose goal is to collect phenotype data for a knockout mouse strain of every protein-coding gene. The scale of the project requires an automatic analysis pipeline to detect abnormal phenotypes and to disseminate the resulting gene-phenotype annotation data into public resources. A body weight phenotype is a common result of knockout studies. Because body weight correlates with many other biological traits, it complicates the interpretation of related gene-phenotype associations: co-correlation can produce gene-phenotype associations that are potentially misleading. Here we use statistical modelling to account for body weight as a potential confounder and to assess its impact. We find a considerable impact on previously established gene-phenotype associations, due to an increase in sensitivity as well as to the confounding effect. We investigated the existing ontologies for representing this phenotypic information and explored ways to ontologically represent the influence of confounders on gene-phenotype associations. Given the scale of data being disseminated by high-throughput programs and the range of downstream studies that use these data, it is critical to consider how we improve the quality of the disseminated data and provide a robust ontological representation.

Electronic supplementary material: The online version of this article (doi:10.1186/s13326-016-0050-8) contains supplementary material, which is available to authorized users.

Highlights

  • In genotype-phenotype studies, one approach to identify abnormal phenotypes is a statistical comparison of data collected from control and gene-altered animals

  • This high-throughput phenotyping is based on a pipeline concept in which a mouse is characterised in a series of phenotype screens underpinned by standard operating procedures defined by the International Mouse Phenotyping Consortium (IMPC) in the International Mouse Phenotyping Resource of Standardised Screens

  • With interest growing in the impact of body weight on phenotypes and the scale of projects being conducted by high-throughput phenotyping consortia, being able to disseminate annotated phenotype data has become an important issue

Introduction

Oellrich et al. Journal of Biomedical Semantics (2016) 7:2

In genotype-phenotype studies, one approach to identifying abnormal phenotypes is a statistical comparison of data collected from control and gene-altered animals. The goal of the IMPC is to produce and phenotypically characterise 20,000 knockout mouse strains in a reproducible manner across multiple research centres. This high-throughput phenotyping is based on a pipeline concept in which a mouse is characterised in a series of phenotype screens underpinned by standard operating procedures, defined by the IMPC in the International Mouse Phenotyping Resource of Standardised Screens (IMPReSS) [2]. The variable “fasted blood glucose concentration” is associated with three MP terms: “abnormal fasted circulating glucose level”, “increased fasted circulating glucose level”, and “decreased fasted circulating glucose level”. Using this approach, abnormal phenotypes identified via statistical analysis are summarised as gene-phenotype associations that are understood by the biological community, which facilitates dissemination to the community (Fig. 1). Sharing these gene-phenotype annotations enables data mining across species and studies, e.g. for disease gene candidate discovery, pharmacogenetics and evolutionary studies [5,6,7].
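The idea of comparing control and gene-altered animals while accounting for body weight can be illustrated with a minimal sketch. This is not the IMPC pipeline or the model used in the article: it is a plain least-squares fit on synthetic data, and the function name `ols`, the variable names, and all numbers are hypothetical. The synthetic trait depends only on body weight, and the knockout animals are lighter, so a naive comparison attributes a trait difference to genotype that disappears once weight is included as a covariate.

```python
def ols(columns, y):
    """Least-squares coefficients for y ~ intercept + columns.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan
    elimination. Returns [intercept, coefficient per column].
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    for c in range(k):
        # Partial pivoting: move the largest remaining entry to the diagonal.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic data (assumption): 5 control and 5 knockout mice.
genotype = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]            # 0 = control, 1 = knockout
weight = [24, 25, 26, 27, 28, 19, 20, 21, 22, 23]    # body weight, knockouts lighter
# Small fixed residuals so the data are not a perfect line.
residual = [0.1, -0.2, 0.0, 0.2, -0.1] * 2
# The trait is driven by body weight only; genotype has no direct effect.
trait = [2.0 + 0.5 * w + e for w, e in zip(weight, residual)]

unadjusted = ols([genotype], trait)          # trait ~ genotype
adjusted = ols([genotype, weight], trait)    # trait ~ genotype + weight

print("genotype effect, unadjusted:", round(unadjusted[1], 3))
print("genotype effect, weight-adjusted:", round(adjusted[1], 3))
print("weight effect:", round(adjusted[2], 3))
```

In this constructed example the unadjusted genotype effect is simply the difference in group means of the trait, while the weight-adjusted genotype effect is approximately zero, because the whole difference is carried by body weight. Real phenotyping data would also need a significance test and handling of factors such as sex and batch, which this sketch leaves out.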
