Abstract

Parametric methods for identifying laterally transferred genes exploit the directional mutational biases unique to each genome. Yet the development of new, more robust methods—as well as the evaluation and proper implementation of existing methods—relies on an arbitrary assessment of performance using real genomes, where the evolutionary histories of genes are not known. We have used the framework of a generalized hidden Markov model to create artificial genomes modeled after genuine genomes. To model a genome, “core” genes—those displaying patterns of mutational biases shared among large numbers of genes—are identified by a novel gene clustering approach based on the Akaike information criterion. Gene models derived from multiple “core” gene clusters are used to generate an artificial genome that models the properties of a genuine genome. Chimeric artificial genomes—representing those having experienced lateral gene transfer—were created by combining genes from multiple artificial genomes, and the performance of the parametric methods for identifying “atypical” genes was assessed directly. We found that a hidden Markov model that included multiple gene models, each trained on sets of genes representing the range of genotypic variability within a genome, could produce artificial genomes that mimicked the properties of genuine genomes. Moreover, different methods for detecting foreign genes performed differently—i.e., they had different sets of strengths and weaknesses—when identifying atypical genes within chimeric artificial genomes.

Highlights

  • With the number of genome sequences accumulating at a rapid pace, evidence for rampant lateral gene transfer among prokaryotes has increased dramatically [1–4]

  • Protein-coding sequences were created by multiple, fifth-order, inhomogeneous Markov models; noncoding sequences were created by a homogeneous Markov model of noncoding sequence accounting for hexamer statistics

  • All gene sequences in a bacterial genome cannot be accurately described by a single model; the probabilistic nature of the hidden Markov model (HMM) would necessarily result in artificial genomes that failed to represent the variability among gene sequences seen in genuine genomes
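
The sequence-generation idea in the highlights above can be illustrated with a minimal sketch. It assumes that "inhomogeneous" means a separate fifth-order transition table for each codon position, which is the usual convention in gene modeling. The function names `train` and `generate`, the pseudocount, and the toy input are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

ALPHABET = "ACGT"
ORDER = 5  # fifth-order: each base depends on the five bases before it


def train(genes, order=ORDER, pseudocount=1.0):
    """Count next-base frequencies per codon position and preceding k-mer.

    Assumes every gene is an uppercase string over A, C, G, T that starts
    in frame. Returns counts[phase][context][base], with phase in 0..2.
    """
    counts = [
        defaultdict(lambda: dict.fromkeys(ALPHABET, pseudocount))
        for _ in range(3)
    ]
    for gene in genes:
        for i in range(order, len(gene)):
            context = gene[i - order:i]
            counts[i % 3][context][gene[i]] += 1
    return counts


def generate(counts, length, seed_context, rng, order=ORDER):
    """Sample a sequence base by base from the phase-specific tables.

    seed_context supplies the first `order` bases; a context never seen
    in training falls back to the uniform pseudocount distribution.
    """
    if len(seed_context) != order:
        raise ValueError("seed_context must contain exactly `order` bases")
    seq = list(seed_context)
    while len(seq) < length:
        phase = len(seq) % 3
        table = counts[phase]["".join(seq[-order:])]
        weights = [table[base] for base in ALPHABET]
        seq.append(rng.choices(ALPHABET, weights=weights)[0])
    return "".join(seq)


if __name__ == "__main__":
    # Toy training set, for illustration only.
    toy_genes = ["ATGGCTAAAGGTCTGAAAGCTTAA", "ATGAAAGCTGGTCTGGCTAAATAA"]
    model = train(toy_genes)
    print(generate(model, 60, "ATGGC", random.Random(0)))
```

In this sketch, one call to `train` yields one gene model. Training a separate model on each cluster of "core" genes and drawing genes from several such models would correspond to the multiple-model design the highlights describe. The hidden-Markov-model machinery that links coding and noncoding states is not shown.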



Introduction

With the number of genome sequences accumulating at a rapid pace, evidence for rampant lateral gene transfer among prokaryotes has increased dramatically [1–4]. Significant advances have been made in understanding this evolutionary phenomenon, and current research is aimed at understanding the impact of gene transfer rather than at demonstrating its occurrence [5–8]. Although inferences regarding the scope and impact of lateral gene transfer rely on the accurate and consistent identification of putative foreign genes, methods for objective, robust quantification of lateral gene transfer have been difficult to devise. There has been no platform available to test the efficacy and performance of methods for the identification of foreign genes. Classification of genes as native or laterally transferred uses various sets of indirect evidence, and the scope and objectivity of each approach are debatable [9–13].

