An en masse phenotype and function prediction system for Mus musculus

Murat Taşan, Weidong Tian, Frederick P Roth, Francis D Gibbons, Judith A Blake, David P Hill

doi:10.1186/gb-2008-9-s1-s8

Abstract

Background: Individual researchers are struggling to keep up with the accelerating emergence of high-throughput biological data, and to extract information that relates to their specific questions. Integration of accumulated evidence should permit researchers to form fewer - and more accurate - hypotheses for further study through experimentation.

Results: Here a method previously used to predict Gene Ontology (GO) terms for Saccharomyces cerevisiae (Tian et al.: Combining guilt-by-association and guilt-by-profiling to predict Saccharomyces cerevisiae gene function. Genome Biol 2008, 9(Suppl 1):S7) is applied to predict GO terms and phenotypes for 21,603 Mus musculus genes, using a diverse collection of integrated data sources (including expression, interaction, and sequence-based data). This combined 'guilt-by-profiling' and 'guilt-by-association' approach optimizes the combination of two inference methodologies. Predictions at all levels of confidence are evaluated by examining genes not used in training, and top predictions are examined manually using available literature and knowledge base resources.

Conclusion: We assigned a confidence score to each gene/term combination. The results provided high prediction performance, with nearly every GO term achieving greater than 40% precision at 1% recall. Among the 36 novel predictions for GO terms and 40 for phenotypes that were studied manually, >80% and >40%, respectively, were identified as accurate. We also illustrate that a combination of 'guilt-by-profiling' and 'guilt-by-association' outperforms either approach alone in their application to M. musculus.
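The "precision at 1% recall" figure quoted in the abstract can be made concrete with a short sketch. The code below is an illustration only, not the authors' evaluation code: the function name `precision_at_recall`, the toy scores and labels, and the tie-handling rule (ties keep input order) are all assumptions. It ranks held-out genes by confidence score for a single term and reports precision at the shallowest cutoff that reaches the requested recall.

```python
import math
from typing import Sequence


def precision_at_recall(
    scores: Sequence[float], labels: Sequence[bool], recall: float
) -> float:
    """Precision at the shallowest ranking cutoff whose recall reaches `recall`.

    scores: confidence score per gene for one term (higher = more confident).
    labels: True if the gene is actually annotated with the term.
    recall: target recall in (0, 1], e.g. 0.01 for "1% recall".
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < recall <= 1:
        raise ValueError("recall must be in (0, 1]")
    total_positives = sum(1 for y in labels if y)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    # Number of true positives that must be recovered to reach the target recall.
    needed = math.ceil(recall * total_positives)

    # Rank genes from most to least confident (stable sort: ties keep input order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            true_positives += 1
            if true_positives >= needed:
                return true_positives / rank
    raise AssertionError("unreachable: needed <= total_positives")


# Hypothetical toy data for a single term (not taken from the paper).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [True, False, True, False, True, True]
toy_precision = precision_at_recall(toy_scores, toy_labels, 0.5)
```

With thousands of held-out genes per term, calling the function with `recall=0.01` would correspond to the operating point named in the abstract, under the assumptions stated above.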

Highlights

With the ever-increasing collection of high-throughput experimental techniques, data acquisition at the genomic scale has never occurred more rapidly.
We present results from one of the nine modeling approaches submitted for the MouseFunc project, producing an updated set of genome-wide annotation predictions for approximately 3,000 Gene Ontology (GO) terms [16] for M. musculus.
Expanded and more recent sets of mammalian phenotype and GO term annotations for the same mouse genes were acquired from Mouse Genome Informatics (MGI) [17].

Summary

Introduction

With the ever-increasing collection of high-throughput experimental techniques, data acquisition at the genomic scale has never occurred more rapidly. Comprehensive annotation systems are of paramount importance, as evidenced by the integration of a large number of data types in many model organism databases [1,2,3,4]. Such databases are the researcher's starting point for informed hypothesis generation, which makes the daunting task of curating the source data for representation in these databases crucial to effective science. In response, curation systems are becoming increasingly reliant on computational approaches to assist in the annotation process. Integration of accumulated evidence should permit researchers to form fewer - and more accurate - hypotheses for further study through experimentation.

Journal: Genome Biology	Publication Date: Jan 1, 2008
Citations: 60	License type: cc-by

Similar Papers

Answering Gene Ontology terms to proteomics questions by supervised macro reading in Medline
Julien Gobeill ... Patrick Ruch
EMBnet.journal | VOL. 18
09 Nov 2012

Global analysis of gene function in yeast by quantitative phenotypic profiling
James A Brown ... Gavin Sherlock
Molecular Systems Biology | VOL. 2
01 Jan 2006

Associating transcription factor-binding site motifs with target GO terms and target genes
Mikael Bodén ... Timothy L Bailey
Nucleic Acids Research | VOL. 36
10 Jun 2008

FunPredCATH: An ensemble method for predicting protein function using CATH
Joseph Bonello ... Christine Orengo
BBA - Proteins and Proteomics | VOL. 1872
19 Dec 2023
