Abstract

Background: Bacterial genomes are being deposited into online databases at an increasing rate. Genome annotation represents one of the first efforts to understand organisms and their diseases. Conserved gene neighbourhoods (CNs), phylogenetic profiles (PPs), and gene fusions are evolutionary relationships that can be annotated from genomes alone. At present, no standalone software builds protein interaction networks efficiently and effectively from these three evolutionary characteristics.

Results: We developed GENPPI, software for the ab initio prediction of interaction networks from the predicted proteins of a genome. In our case study, we employed 50 genomes of the genus Corynebacterium. Based on the PP relationship, GENPPI differentiated genomes of the ovis and equi biovars of the species Corynebacterium pseudotuberculosis and created groups among the other species analysed. Inspecting only the CN relationship, we could separate species but could not entirely separate biovars. GENPPI is efficient: for example, it creates interaction networks from the central genomes of 50 species/lineages, with an average size of 2200 genes, in less than 40 min on a conventional computer. Moreover, the interaction networks that our software creates reflect correct evolutionary relationships between species, which we confirmed with average nucleotide identity analyses. The software also lets users define, through various parameters, how they intend to explore the PP and CN characteristics, enabling the creation of customized interaction networks. For instance, users can set parameters suited to a genus, metagenome, or pangenome. Beyond the parameterization of GENPPI, users also choose which set of genomes to study.

Conclusions: GENPPI can help fill the gap between the considerable number of novel genomes assembled monthly and our ability to process interaction networks that consider the noncore genes of all completed genome versions. With GENPPI, a user dictates how many genomes, and how evolutionarily correlated, are used to answer a scientific query.

Highlights

  • The annotation of genomes is an important task to perform after sequencing and assembly

  • GENPPI can help fill the gap concerning the considerable number of novel genomes assembled monthly and our ability to process interaction networks considering the noncore genes for all completed genome versions

  • An analysis of the data from the heat maps of our work indicates that the genome named C. rouxii was C. diphtheriae

Summary

Introduction

The annotation of genomes is an important task to perform after sequencing and assembly. If we consider ORFs to be vertices and the relationships between them to be edges, a complex network can be constructed from a genome. STRING presents annotation data for more than five thousand genomes spread over a wide range of organisms. Features such as conserved gene neighbourhood, conserved phylogenetic profile, gene fusion, Gene Ontology features (molecular function, process, and localization), coexpression, experiments, and bibliographic evidence are combined to create a probabilistic strength of belief of interaction for pairs of proteins [5]. We know that at least 10% of the predicted genes from a recently elucidated genome are not present in previously annotated genomes [6]. This property implies that in a newly characterized Escherichia coli lineage, at least five hundred genes will not receive a single annotation if topological annotations based on sequence similarity (TABSS) are utilized. We propose a new bioinformatic tool, named GENPPI, that is capable of processing a set of genomes stored on a conventionally configured machine.
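
To make the vertex-and-edge view concrete, the following is a minimal sketch of how phylogenetic-profile (PP) and conserved-neighbourhood (CN) edges could be derived from toy data. It is not GENPPI's algorithm: the function names (`pp_edges`, `cn_edges`), the Jaccard similarity measure, and the `threshold`, `window`, and `min_genomes` parameters are all assumptions chosen for illustration, and genes are assumed to be already grouped into families shared across genomes.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pp_edges(gene_presence, threshold=1.0):
    """Link gene families whose presence/absence profiles are similar.

    gene_presence maps a gene family (hypothetical label) to the set of
    genomes that carry it. An edge is created when the Jaccard similarity
    of two profiles reaches `threshold` (an illustrative choice).
    """
    edges = set()
    for g1, g2 in combinations(sorted(gene_presence), 2):
        if jaccard(set(gene_presence[g1]), set(gene_presence[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


def cn_edges(gene_orders, window=1, min_genomes=2):
    """Link gene families that stay close together in several genomes.

    gene_orders maps a genome to its ordered list of gene families.
    Two families are neighbours in a genome when they lie within
    `window` positions of each other; an edge is created when that
    holds in at least `min_genomes` genomes. Chromosomes are treated
    as linear here, which is a simplification.
    """
    counts = {}
    for order in gene_orders.values():
        seen = set()  # count each pair at most once per genome
        for i, g1 in enumerate(order):
            for g2 in order[i + 1 : i + 1 + window]:
                if g1 != g2:
                    seen.add(tuple(sorted((g1, g2))))
        for pair in seen:
            counts[pair] = counts.get(pair, 0) + 1
    return {pair for pair, n in counts.items() if n >= min_genomes}
```

With four toy gene families spread over three genomes, lowering `threshold` admits families with partially overlapping profiles, while raising `min_genomes` keeps only neighbourhoods conserved in more genomes. This mirrors, in a very reduced form, the idea that the user's parameters shape the resulting network.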
