Abstract

Background: Fused genes are important sources of data for studies of evolution and protein function. To date no service has been made available online to aid in the large-scale identification of fused genes in sequenced genomes. We have developed a program, Gene deFuser, that analyzes uploaded protein sequence files for characteristics of gene fusion events and presents the results in a convenient web interface.

Results: To test the ability of this software to detect fusions on a genome-wide scale, we analyzed the 24,725 gene models predicted for the ciliated protozoan Tetrahymena thermophila. Gene deFuser detected members of eight of the nine families of gene fusions known or predicted in this species and identified nineteen new families of fused genes, each containing between one and twelve members. In addition to these genuine fusions, Gene deFuser also detected a particular type of gene misannotation, in which two independent genes were predicted as a single transcript by gene annotation tools. Twenty-nine of the artifacts detected by Gene deFuser in the initial annotation have been corrected in subsequent versions, with a total of 25 annotation artifacts (about one third of the total fusions identified) remaining in the most recent annotation.

Conclusions: The newly identified Tetrahymena fusions belong to classes of genes involved in processes such as phospholipid synthesis, nuclear export, and surface antigen generation. These results highlight the potential of Gene deFuser to reveal a large number of novel fused genes in evolutionarily isolated organisms. Gene deFuser may also prove useful as an ancillary tool for detecting fusion artifacts during gene model annotation.

Highlights

  • To test Gene deFuser’s ability to detect fused genes, we used it to analyze the genome of Tetrahymena thermophila, a ciliated protozoan evolutionarily distant from the seven eukaryotic species used to populate the KOG database

  • In addition to the evolutionary gene fusions we expected to find with this tool, we attempted to identify artificial gene fusions created during the process of gene model annotation, by comparing the earliest round of gene predictions with the most recent round

Introduction

Fused genes are important sources of data for studies of evolution and protein function. Although very few of the recombination events that join genes produce proteins that retain their proper function or expression pattern, on occasion the constituent genes do combine to form a new, working gene that can be passed on to offspring [2]. While it has been hypothesized that two genes with unrelated functions may merge and be retained in the genome [4,5], almost all bifunctional fusion genes seen to date show a functional relationship between the proteins that make up the fusion. Most fused gene pairs have orthologs that are part of the same metabolic pathway, are involved in the same protein complex [6], or regulate one another’s activity [5]. A selective advantage may emerge if the fused protein provides greater catalytic activity or more efficient co-regulation than is possible for the two independent proteins.
