Abstract

Background: The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons.

Results: To support these studies, EDGAR ("Efficient Database framework for comparative Genome Analyses using BLAST score Ratios") was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software, and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently; moreover, it was possible to root each strain with unprecedented accuracy.

Conclusion: EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface, where the precomputed data sets can be browsed.
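The classification of genes as core genes or singletons mentioned above can be illustrated with a minimal sketch. It is not EDGAR's implementation: it assumes a BLAST score ratio defined as the bit score of a hit divided by the bit score of the query against itself, and it uses a hypothetical cutoff of 0.3. The names `score_ratio`, `classify_genes`, and the input dictionaries are invented for illustration, and the sketch checks hits in one direction only.

```python
def score_ratio(hit_bits: float, self_bits: float) -> float:
    """Bit score of a hit, normalised by the query's self-hit bit score."""
    if self_bits <= 0:
        raise ValueError("self-hit bit score must be positive")
    return hit_bits / self_bits


def classify_genes(ref_genes, best_hits, self_scores, genomes, cutoff=0.3):
    """Split reference genes into core genes and singletons.

    ref_genes   -- gene ids of the reference genome
    best_hits   -- dict mapping (gene, genome) to the best hit bit score
    self_scores -- dict mapping gene to its self-hit bit score
    genomes     -- the other genomes in the comparison
    cutoff      -- hypothetical score ratio threshold (an assumption)
    """
    core, singletons = [], []
    for gene in ref_genes:
        # Genomes in which this gene has a hit above the cutoff.
        present_in = [
            genome
            for genome in genomes
            if score_ratio(best_hits.get((gene, genome), 0.0),
                           self_scores[gene]) >= cutoff
        ]
        if len(present_in) == len(genomes):
            core.append(gene)        # found in every compared genome
        elif not present_in:
            singletons.append(gene)  # found in none of them
    return core, singletons


# Toy data: three reference genes compared against two other genomes.
self_scores = {"a": 100.0, "b": 100.0, "c": 100.0}
best_hits = {
    ("a", "G1"): 90.0, ("a", "G2"): 80.0,  # good hits in both genomes
    ("b", "G1"): 70.0,                     # good hit in one genome only
    ("c", "G1"): 5.0,                      # only a weak hit
}
print(classify_genes(["a", "b", "c"], best_hits, self_scores, ["G1", "G2"]))
# prints (['a'], ['c'])
```

Gene `b` lands in neither list because it is shared with only some of the compared genomes.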

Highlights

  • The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes

  • Based on the core genome of 2,156 coding sequences (CDS), the divergence of these plant-pathogenic bacteria was quantified, with the recently annotated X. campestris pv. campestris (Xcc) B100 employed as the reference to construct the tree

  • As we demonstrated with the use case, EDGAR provides various useful features for the comparative analysis of closely related genomes


Summary

Introduction

The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. The mid-1950s produced a rather pragmatic definition of the term species, described as a group of cultures or strains which is accepted by bacteriologists as sufficiently closely related [1]. About thirty years later, a more fundamental definition of the term [2] considered measurable quantities, including DNA reassociation values between strains and phenotypic traits. These classical approaches are likely to be superseded by conclusions drawn from the growing collection of genomic information.
