Abstract

Background: Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains. Several findings have revealed that genomic structural variations (SVs), including gene gain/loss, gene duplication and genome rearrangement, can lead to different phenotypes among strains, and an investigation of genes affected by SVs may extend our knowledge of the relationships between SVs and phenotypes in microbes, especially in pathogenic bacteria.

Results: In this work, we introduce a ‘Genome Topology Network’ (GTN) method based on gene homology and gene locations to analyze genomic SVs and perform phylogenetic analysis. Furthermore, the concept of ‘unfixed ortholog’ has been proposed, whose members are affected by SVs in genome topology among close species. To improve the precision of ‘unfixed ortholog’ recognition, a strategy to detect annotation differences and complete gene annotation was applied. To assess the GTN method, a set of thirteen complete M. tuberculosis genomes was analyzed as a case study. GTNs with two different gene homology-assigning methods were built, the Clusters of Orthologous Groups (COG) method and the orthoMCL clustering method, and two phylogenetic trees were constructed accordingly, which may provide additional insights into whole genome-based phylogenetic analysis. We obtained 24 unfixable COG groups, of which most members were related to immunogenicity and drug resistance, such as PPE-repeat proteins (COG5651) and transcriptional regulator TetR gene family members (COG1309).

Conclusions: The GTN method has been implemented in PERL and released on our website. The tool can be downloaded from http://homepage.fudan.edu.cn/zhouyan/gtn/, and allows re-annotating the ‘lost’ genes among closely related genomes, analyzing genes affected by SVs, and performing phylogenetic analysis. With this tool, many immunogenic-related and drug resistance-related genes were found to be affected by SVs in M. tuberculosis genomes. We believe that the GTN method will be suitable for the exploration of genomic SVs in connection with biological features of bacterial strains, and that GTN-based phylogenetic analysis will provide additional insights into whole genome-based phylogenetic analysis.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1259-0) contains supplementary material, which is available to authorized users.

Highlights

  • Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains

  • We propose the concept of ‘unfixed ortholog’, whose members are affected by structural variations (SVs) in genome topology among close species

  • The Genome Topology Network (GTN) method has been implemented in PERL and released on our website
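
To make the idea of a network built from gene homology and gene locations more concrete, the following is a minimal Python sketch, not the authors' PERL tool. It assumes, purely for illustration, that each genome is given as a list of homology-group labels in chromosomal order, that the network links groups whose genes are adjacent on the chromosome, and that a group whose neighbours differ between genomes is a candidate for being affected by SVs. The function names `build_gtn`, `neighbor_sets` and `variable_groups`, and the toy labels `g1` to `g5`, are hypothetical; the published definition of ‘unfixed ortholog’ may differ from this stand-in.

```python
from collections import defaultdict


def build_gtn(gene_order, circular=True):
    """Return the set of undirected edges linking homology groups of adjacent genes.

    gene_order: list of homology-group labels in chromosomal order (assumed input format).
    """
    edges = set()
    n = len(gene_order)
    last = n if circular else n - 1  # close the ring for circular chromosomes
    for i in range(last):
        a, b = gene_order[i], gene_order[(i + 1) % n]
        if a != b:  # skip self-loops from tandem copies of the same group
            edges.add(frozenset((a, b)))
    return edges


def neighbor_sets(edges):
    """Map each homology group to the set of groups it is adjacent to."""
    nbrs = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        nbrs[a].add(b)
        nbrs[b].add(a)
    return nbrs


def variable_groups(genomes):
    """Return groups whose neighbour set differs between at least two genomes.

    This is an illustrative proxy for 'affected by SVs', not the paper's criterion.
    """
    per_genome = [neighbor_sets(build_gtn(order)) for order in genomes.values()]
    all_groups = set()
    for order in genomes.values():
        all_groups.update(order)
    result = set()
    for group in all_groups:
        seen = {frozenset(nbrs.get(group, set())) for nbrs in per_genome}
        if len(seen) > 1:
            result.add(group)
    return result


# Toy example: strain B carries an inversion of g3 and g4 relative to strain A.
toy_genomes = {
    "A": ["g1", "g2", "g3", "g4", "g5"],
    "B": ["g1", "g2", "g4", "g3", "g5"],
}
print(sorted(variable_groups(toy_genomes)))
```

In this toy case only `g1` keeps the same neighbours in both strains, so the other four groups are flagged. A real analysis would also need to handle paralogs, missing annotations and multiple replicons, which this sketch ignores.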

Summary

Introduction

Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains. Several findings have revealed that genomic structural variations (SVs), including gene gain/loss, gene duplication and genome rearrangement, can lead to different phenotypes among strains, and an investigation of genes affected by SVs may extend our knowledge of the relationships between SVs and phenotypes in microbes, especially in pathogenic bacteria. As complete genome sequences become available, genome organization studies become attractive because of their applications in gene function prediction and phylogenetics. Gene order and gene content, as determined by ortholog identification, have been well investigated in genome organization studies. Whole genome feature-based phylogenetic methods are urgently needed as more complete genome sequences become available. In contrast to phylogenetic studies based on a single gene or a few genes, studies over entire genomes capture more evolutionary variation and may yield better-supported phylogenetic relationships. Some genome feature-based methods have emerged that use gene content, gene order or the distribution of oligonucleotides (‘DNA strings’) in phylogenetic analyses. Methods based on whole genome single nucleotide polymorphisms (SNPs) are increasingly being used in phylogenetic analysis [4].
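
As a rough illustration of how gene content can serve as a genome feature for phylogenetics, the sketch below computes a Jaccard distance between the sets of ortholog groups found in each genome and collects the pairwise distances that a distance-based tree method (for example neighbour joining) could take as input. The Jaccard measure is one common choice swapped in here for illustration; the source does not state which distance the cited methods use. The names `gene_content_distance` and `distance_matrix` and the strain and group labels are hypothetical.

```python
from itertools import combinations


def gene_content_distance(groups_a, groups_b):
    """Jaccard distance between two genomes' sets of ortholog groups."""
    a, b = set(groups_a), set(groups_b)
    union = a | b
    if not union:
        return 0.0  # two empty genomes are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(genomes):
    """Return sorted genome names and a dict of symmetric pairwise distances."""
    names = sorted(genomes)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = gene_content_distance(genomes[x], genomes[y])
        dist[(x, y)] = dist[(y, x)] = d
    return names, dist


# Toy gene content for three hypothetical strains.
toy_content = {
    "s1": {"a", "b", "c"},
    "s2": {"a", "b", "d"},
    "s3": {"a", "b", "c"},
}
names, dist = distance_matrix(toy_content)
for x, y in combinations(names, 2):
    print(x, y, dist[(x, y)])
```

Gene content alone cannot separate strains that share the same genes in a different arrangement (here `s1` and `s3` are at distance zero), which is one reason gene order and other whole genome features are used alongside it.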

