Abstract
Background: The relationships between bacterial genomes are complicated by rampant horizontal gene transfer, varied selection pressures, acquisition of new genes, loss of genes, and divergence of genes, even in closely related lineages. As more and more bacterial genomes are sequenced, organizing and interpreting the incredible amount of relational information that connects them becomes increasingly difficult.
Results: We have developed CodaChrome (http://www.sourceforge.com/p/codachrome), a one-versus-all proteome comparison tool that allows the user to visually investigate the relationship between a bacterial proteome of interest and the proteomes encoded by every other bacterial genome recorded in GenBank in a massive interactive heat map. This tool has allowed us to rapidly identify the most highly conserved proteins encoded in the bacterial pan-genome, fast-clock genes useful for subtyping of bacterial species, the evolutionary history of an indel in the Sphingobium lineage, and an example of horizontal gene transfer from a member of the genus Enterococcus to a recent ancestor of Helicobacter pylori.
Conclusion: CodaChrome is a user-friendly and powerful tool for simultaneously visualizing relationships between thousands of proteomes.
Highlights
Visualization of the CodaChrome matrix file: The data contained in the CodaChrome matrix file can be visualized using the CodaChrome graphical user interface (GUI) (Figure 1). This GUI is programmed in C++/Qt and can be compiled to run on most common platforms. It renders the CodaChrome matrix file into a heat map image in which each row corresponds to a replicon in GenBank, each column corresponds to a protein in the seed organism, and each pixel is colored according to the percent identity between the two proteins it represents.
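The row/column/pixel layout described above can be illustrated with a minimal sketch. It is not the CodaChrome implementation (which is written in C++/Qt): the function names, the white-to-red palette, and the choice to draw below-cutoff cells in black are all assumptions made for illustration, and the matrix is assumed to be already loaded as a list of rows of percent identities.

```python
def identity_to_rgb(pct, cutoff=0.0):
    """Map a percent identity (0-100) to an RGB tuple.

    Hypothetical palette: white at 0% fading to pure red at 100%.
    Values below the cutoff are drawn black (an assumption).
    """
    if pct < cutoff:
        return (0, 0, 0)
    frac = max(0.0, min(pct, 100.0)) / 100.0
    level = int(255 * (1.0 - frac))  # green/blue channels shrink as identity rises
    return (255, level, level)


def render_heatmap(matrix, cutoff=0.0):
    """Turn a percent-identity matrix into rows of RGB pixels.

    matrix[r][c] is the percent identity between seed protein c and its
    counterpart in replicon r, so each output row is one replicon and
    each output column is one seed protein.
    """
    return [[identity_to_rgb(value, cutoff) for value in row] for row in matrix]
```

Calling `render_heatmap(matrix, cutoff=20.0)` on a small matrix returns one list of pixels per replicon, which could then be passed to any image library for display.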
When the proteome of the Firmicute Bacillus subtilis was used as the seed in CodaChrome and replicons were sorted by average percent identity to the entire seed proteome with a percent identity cutoff of 20%, we noticed a vertical stripe of red and pink standing out in a region far above the horizontal band of red and pink corresponding to proteomes closely related to that of B. subtilis (Figure 6a).
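The sorting step mentioned above (ordering replicons by their average percent identity to the whole seed proteome, subject to a cutoff) might be sketched as follows. How CodaChrome applies the cutoff internally is not stated here, so this sketch assumes that identities below the cutoff count as zero when the row average is computed. The function name and arguments are hypothetical, and the sketch sorts in descending order without trying to reproduce the on-screen orientation.

```python
def sort_replicons(matrix, names, cutoff=20.0):
    """Order replicons by mean percent identity to the seed proteome.

    matrix[r] holds the percent identities of replicon r against every
    seed protein; names[r] is that replicon's label. Values below the
    cutoff are treated as 0 (an assumption, not confirmed by the source).
    Returns (sorted_names, sorted_matrix), highest mean first.
    """
    def mean_identity(row):
        kept = [v if v >= cutoff else 0.0 for v in row]
        return sum(kept) / len(kept) if kept else 0.0

    order = sorted(range(len(matrix)),
                   key=lambda i: mean_identity(matrix[i]),
                   reverse=True)
    return [names[i] for i in order], [matrix[i] for i in order]
```

With rows sorted this way, close relatives of the seed cluster together in one band, so a column that is strongly conserved in an otherwise distant replicon shows up as an isolated stripe away from that band.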