Abstract

Background: First-pass methods based on BLAST matches are commonly used as an initial step to separate the different phylogenetic histories of genes in microbial genomes and to target putative horizontal gene transfer (HGT) events. This will continue to be necessary given the rapid growth of genomic data and the technical difficulties in conducting large-scale explicit phylogenetic analyses. However, these methods often produce misleading results due to their inability to resolve indirect phylogenetic links and their vulnerability to stochastic events.

Results: A new computational method for rapid, exhaustive and genome-wide detection of HGT was developed, featuring the systematic analysis of BLAST hit distribution patterns in the context of a priori defined hierarchical evolutionary categories. Genes that fall beyond a series of statistically determined thresholds are identified as not adhering to the typical vertical history of the organisms in question, but instead as having a putative horizontal origin. Tests on simulated genomic data suggest that this approach effectively targets atypically distributed genes that are highly likely to be HGT-derived, and that it exhibits robust performance compared to conventional BLAST-based approaches. The method was further tested on real genomic datasets, including Rickettsia genomes, and was compared to previous studies. Results show consistency with currently employed categories of HGT prediction methods. In-depth analysis of both simulated and real genomic data suggests that the method is notably insensitive to stochastic events such as gene loss, rate variation and database error, which are common challenges to the current methodology. An automated pipeline was created to implement this approach and was made publicly available at: https://github.com/DittmarLab/HGTector. The program is versatile, easily deployed, and has a low requirement for computational resources.

Conclusions: HGTector is an effective tool for initial or standalone large-scale discovery of candidate HGT-derived genes.

Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-717) contains supplementary material, which is available to authorized users.

Highlights

  • First pass methods based on Basic Local Alignment Search Tool (BLAST) match are commonly used as an initial step to separate the different phylogenetic histories of genes in microbial genomes, and target putative horizontal gene transfer (HGT) events

  • Given the rapid increase in available annotated genome data, and the associated computational challenge of analyzing such data, the BLAST best match method has remained a popular surrogate for first pass discovery analyses of gene histories that differ from the strict vertical pattern [16]

  • None of the negative control groups have an identifiable zero peak, which is equivalent to a vertical history for all genes

Summary

Introduction

First-pass methods based on BLAST matches are commonly used as an initial step to separate the different phylogenetic histories of genes in microbial genomes and to target putative horizontal gene transfer (HGT) events. This will continue to be necessary given the rapid growth of genomic data and the technical difficulties in conducting large-scale explicit phylogenetic analyses. Given the rapid increase in available annotated genome data, and the associated computational challenge of analyzing such data, the BLAST best match method has remained a popular surrogate for first-pass discovery analyses of gene histories that differ from the strict vertical pattern [16]. In this strategy, BLAST hits are sorted by measures such as bit score, an indicator of sequence similarity, and the best match organism represented by the top hit is identified for each gene [17]. Examples of programs featuring this approach include Pyphy [25], PhyloGenie [26], NGIBWS [27], and DarkHorse [28,29]; the latter employs a user-definable filter threshold in combination with taxonomic scaling.
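
The best match strategy described above can be sketched in a few lines of Python. This is only an illustration of the general idea (sort hits by bit score, take the top hit's organism for each gene), not the implementation used by any of the cited programs. The hit records, the organism labels and the function name `best_match_organisms` are hypothetical and made up for the example.

```python
# Hypothetical BLAST hits as (query gene, subject organism, bit score).
# Values are invented for illustration; they are not results from the paper.
hits = [
    ("gene_1", "close_relative_A", 512.0),
    ("gene_1", "close_relative_B", 498.5),
    ("gene_1", "distant_taxon_X", 120.0),
    ("gene_2", "distant_taxon_X", 640.0),
    ("gene_2", "close_relative_A", 310.2),
]


def best_match_organisms(hits):
    """Return, for each gene, the organism of its highest-scoring hit."""
    best = {}  # gene -> (organism, bit score) of the best hit seen so far
    for gene, organism, bit_score in hits:
        # Keep the hit only if it beats the current best for this gene;
        # on a tie, the first hit encountered is retained.
        if gene not in best or bit_score > best[gene][1]:
            best[gene] = (organism, bit_score)
    return {gene: organism for gene, (organism, _) in best.items()}


if __name__ == "__main__":
    for gene, organism in best_match_organisms(hits).items():
        print(gene, organism)
```

In this toy input, the top hit of `gene_2` belongs to a distant taxon, which is the kind of pattern a best match method would single out for closer inspection. Because the decision rests on a single top hit per gene, the sketch also makes clear why such methods are sensitive to stochastic events such as gene loss or database error.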

