Denoising inferred functional association networks obtained by gene fusion analysis

Atanas Kamburov, Leon Goldovsky, Victor Kunin, Aliki Kapazoglou, Shiri Freilich, Christos A Ouzounis, Athanasios Tsaftaris, Anton J Enright

doi:10.1186/1471-2164-8-460

Abstract

Background: Gene fusion detection, also known as the 'Rosetta Stone' method, involves the identification of fused composite genes in a set of reference genomes, which indicates potential interactions between their un-fused counterpart genes in query genomes. The precision of this method typically improves as the number of reference genomes increases.

Results: To explore the usefulness and scope of this approach for protein interaction prediction, and to generate a high-quality, non-redundant set of interacting protein pairs across a wide taxonomic range, we have exhaustively performed gene fusion analysis for 184 genomes using an efficient variant of a previously developed protocol. By analyzing interaction graphs and applying a threshold that limits the maximum number of possible interactions within the largest graph components, we show that we can reduce the number of implausible interactions that arise from the detection of promiscuous domains. With this generally applicable approach, we generate a robust set of over 2 million distinct and testable interactions encompassing 696,894 proteins in 184 species or strains, most of which have never been the subject of high-throughput experimental proteomics. We investigate the cumulative effect of increasing numbers of genomes on the fidelity and quantity of predictions, and show that, for large numbers of genomes, predictions do not become saturated but continue to grow linearly for the majority of the species. We also examine the percentage of component (and composite) proteins in relation to the number of genes, and further validate the functional categories that are highly represented in this robust set of detected genome-wide interactions.

Conclusion: We illustrate the phylogenetic and functional diversity of gene fusion events across genomes, and their usefulness for accurate prediction of protein interaction and function.
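The thresholding step mentioned in the abstract can be pictured with a small sketch. The abstract does not give the exact rule, so this is only one plausible reading under stated assumptions: predicted interactions are treated as edges of an undirected graph, and any connected component holding more than `max_edges` interactions is discarded as likely dominated by promiscuous domains. The function names and the `max_edges` parameter are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def connected_components(edges):
    """Return a list of node sets, one per connected component."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for node in adjacency:
        if node in seen:
            continue
        # Depth-first traversal from an unvisited node.
        stack = [node]
        seen.add(node)
        members = set()
        while stack:
            current = stack.pop()
            members.add(current)
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(members)
    return components


def filter_large_components(edges, max_edges):
    """Keep only edges in components with at most max_edges edges.

    Assumption (not from the paper): a whole component is dropped once
    its edge count exceeds the threshold.
    """
    edges = {tuple(sorted(edge)) for edge in edges}
    kept = set()
    for members in connected_components(edges):
        # Both endpoints of an edge lie in the same component,
        # so checking one endpoint is enough.
        component_edges = {e for e in edges if e[0] in members}
        if len(component_edges) <= max_edges:
            kept |= component_edges
    return kept


if __name__ == "__main__":
    # A hub protein "P" linked to five partners, plus one isolated pair.
    predicted = [("P", f"x{i}") for i in range(5)] + [("A", "B")]
    print(filter_large_components(predicted, max_edges=3))
```

In this toy input the hub component has five edges and is removed, while the isolated pair survives. A real analysis would need to choose the threshold from the data.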

Highlights

Gene fusion detection, known as the 'Rosetta Stone' method, involves the identification of fused composite genes in a set of reference genomes, which indicates potential interactions between their un-fused counterpart genes in query genomes
We present the largest analysis, as yet undertaken, toward the computational detection of protein-protein interactions and functional associations based on gene fusion events
We illustrate that gene fusion events are widespread across the different domains of life and are present in intricate patterns across various genomes, both in terms of their phylogenetic distribution and in terms of the biological diversity of proteins involved in these events

Summary

Introduction

Gene fusion detection, known as the 'Rosetta Stone' method, involves the identification of fused composite genes in a set of reference genomes, which indicates potential interactions between their un-fused counterpart genes in query genomes. The precision of this method typically improves as the number of reference genomes increases. The gene-fusion approach relies on the observation that pairs of genes encoding proteins of known function (usually interacting or forming a complex) tend to be found in other species as a fused composite gene encoding a single multifunctional protein. This event had been previously noted in protein evolution but not explicitly used for the prediction of protein function [8]. We refer to the product of the fused gene as a 'composite' protein, while the products of the un-fused counterpart genes in the reference organism are referred to as 'component' proteins [1,3].
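The composite/component relationship described above can be illustrated with a minimal sketch. It assumes that similarity search has already been run and that each composite protein comes with a list of component matches given as `(component_id, start, end)` in the composite's coordinates. Two components are reported as a candidate pair when they match the same composite in regions that do not overlap. The input layout, the function name `predict_pairs`, and the `max_overlap` parameter are all hypothetical; the published protocol applies further checks that this excerpt does not describe.

```python
from itertools import combinations


def predict_pairs(hits, max_overlap=0):
    """Map each candidate component pair to the composites supporting it.

    hits: dict of composite_id -> list of (component_id, start, end),
          with 1-based inclusive coordinates on the composite protein.
    max_overlap: residues the two matched regions may share
                 (hypothetical tolerance, default none).
    """
    pairs = {}
    for composite, regions in hits.items():
        for (a, a_start, a_end), (b, b_start, b_end) in combinations(regions, 2):
            if a == b:
                continue  # the same component matched twice
            # Number of shared positions; zero or negative means disjoint.
            overlap = min(a_end, b_end) - max(a_start, b_start) + 1
            if overlap <= max_overlap:
                key = tuple(sorted((a, b)))
                pairs.setdefault(key, set()).add(composite)
    return pairs


if __name__ == "__main__":
    toy_hits = {
        # A and B cover separate halves of composite C1.
        "C1": [("A", 1, 100), ("B", 120, 250)],
        # A and D overlap on C2, so they are not paired.
        "C2": [("A", 1, 100), ("D", 50, 150)],
    }
    print(predict_pairs(toy_hits))
```

Recording the supporting composites for each pair makes it possible to count how many independent fusion events back a given prediction.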


Journal: BMC Genomics
Publication Date: Dec 1, 2007
Citations: 39
License type: cc-by


