Abstract

The growth in genomic data production since the advent of next-generation sequencing (NGS) has created a need for new bioinformatics tools and fields, such as comparative genomics. In comparative genomics, the genetic material of one organism is compared directly with that of another to better understand biological species. Moreover, the exponentially growing number of deposited prokaryotic genomes has enabled the investigation of genomic characteristics that are intrinsic to particular species. This led to a new approach to comparative genomics, termed pan-genomics, in which multiple organisms of the same species or genus are compared. Many tools can currently perform pan-genomic analyses, such as PGAP (Pan-Genome Analysis Pipeline), Panseq (Pan-Genome Sequence Analysis Program) and PGAT (Prokaryotic Genome Analysis Tool). Of these, PGAP is written in Perl, and its reliance on UNIX terminals and an extensive parameterized command line can be a barrier for users without prior computational experience. The aim of this study was therefore to develop a web application, PanWeb, that serves as a graphical interface for PGAP. In addition, the application uses the output files of the PGAP pipeline to generate graphics with custom scripts written in the R programming language. PanWeb is freely available at http://www.computationalbiology.ufpa.br/panweb.

Highlights

  • Next-generation sequencing (NGS) platforms have brought major advances in DNA sequencing, mainly through increased yield and accuracy and significantly reduced cost [1,2]

  • PanWeb has proven efficient for pan-genomic analysis, and its user-friendly interface presents the results as graphs or tables, unlike other available tools for pan-genomic analysis such as Panseq [11] and PGAT [13]; this is useful to users without computational skills

  • The analysis page accepts two types of input: an upload form for input files in EMBL or GBK format and, for users who want to run a pan-genomic analysis on microorganisms that have a COG classification, a list of available microorganisms. In addition, other fields define parameters such as e-value, identity, coverage and the type of analysis to be performed (Pan-genome profile analysis, Genetic variation analysis, Species evolution analysis or Function enrichment analysis)
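
The inputs and parameters listed for the analysis page can be pictured as a small request object that is checked before a job is started. The following Python sketch is illustrative only and is not taken from PanWeb: the class name, field names, file suffixes, value ranges (identity and coverage as fractions between 0 and 1) and error messages are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List

# The four analysis types named for the analysis page.
ANALYSIS_TYPES = {
    "Pan-genome profile analysis",
    "Genetic variation analysis",
    "Species evolution analysis",
    "Function enrichment analysis",
}

# Assumed file suffixes for the EMBL and GBK input formats.
ALLOWED_SUFFIXES = {".embl", ".gbk"}


@dataclass
class AnalysisRequest:
    """Hypothetical container for one submission of the analysis form."""
    analysis: str
    evalue: float
    identity: float   # assumed to be a fraction in (0, 1]
    coverage: float   # assumed to be a fraction in (0, 1]
    uploaded_files: List[str] = field(default_factory=list)
    selected_organisms: List[str] = field(default_factory=list)


def validate(request: AnalysisRequest) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    errors = []
    # Either uploaded files or organisms picked from the list must be present.
    if not request.uploaded_files and not request.selected_organisms:
        errors.append("no input: upload files or select organisms")
    for name in request.uploaded_files:
        if Path(name).suffix.lower() not in ALLOWED_SUFFIXES:
            errors.append("unsupported file format: " + name)
    if request.evalue <= 0:
        errors.append("e-value must be positive")
    if not 0 < request.identity <= 1:
        errors.append("identity must be in (0, 1]")
    if not 0 < request.coverage <= 1:
        errors.append("coverage must be in (0, 1]")
    if request.analysis not in ANALYSIS_TYPES:
        errors.append("unknown analysis type: " + request.analysis)
    return errors
```

A caller would build an `AnalysisRequest` from the submitted form and only hand the job to the pipeline when `validate` returns an empty list.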


Summary

Introduction

Next-generation sequencing (NGS) platforms have brought major advances in DNA sequencing, mainly through increased yield and accuracy and significantly reduced cost [1,2]. As a result of NGS technologies, the number of complete genomes deposited in public databases such as the Genomes OnLine Database (GOLD; https://gold.jgi.doe.gov/) has increased exponentially. With the large number of available genomes, especially prokaryotic genomes, it has become possible to perform comparative analyses, such as pan-genomic analyses [3,4]. Comparing the genetic repertoires of organisms can aid in the discovery of genes of biotechnological, biomedical and environmental interest [5].

