Abstract

Pangenome analysis is fundamental to exploring the molecular evolution of bacterial populations. Despite the wide availability of software for pangenome reconstruction, few tools allow downstream data analysis to be customized in a simple and standardized way. To fill this gap, we introduce Pagoo, a new framework that enables straightforward handling of pangenome data. The encapsulated design of Pagoo stores complex molecular and phenotypic information in an object-oriented structure with built-in methods, so the user can move back and forth between the data and its analyses within a single programming environment. This design also allows any analysis to be saved along with the unaltered pangenome data in a single file, making it shareable and reproducible. Pagoo acts as an API between the pangenome data and the user through the R console, providing tools to query, subset, compare and visualize the data and to perform statistical analyses, and it can also be used in concert with other microbial genomics packages available in the R ecosystem. As working examples, we used 1,000 Escherichia coli genomes to show that Pagoo is scalable, and a global dataset of Campylobacter fetus genomes to identify evolutionary patterns and genomic markers of host adaptation in this pathogen. Pagoo represents an integrative and extensible approach that makes pangenome analysis simple and reproducible.

