Abstract

In the modern genomic era, scientists without extensive bioinformatic training need to apply high-power computational analyses to critical tasks like phage genome annotation. At the Center for Phage Technology (CPT), we developed a suite of phage-oriented tools housed in open, user-friendly web-based interfaces. A Galaxy platform conducts computationally intensive analyses, and Apollo, a collaborative genome annotation editor, visualizes the results of these analyses. The collection includes open-source applications such as the BLAST+ suite, InterProScan, and several gene callers, as well as unique tools developed at the CPT that allow maximum user flexibility. We describe in detail programs for finding Shine-Dalgarno sequences, resources used for confident identification of lysis genes such as spanins, and methods used for identifying interrupted genes that contain frameshifts or introns. At the CPT, genome annotation is separated into two robust segments, each facilitated by the automated execution of many tools chained together in an operation called a workflow. First, the structural annotation workflow results in gene and other feature calls. This is followed by a functional annotation workflow that combines sequence comparisons and conserved domain searching, which is contextualized to allow integrated evidence assessment in functional prediction. Finally, we describe a workflow used for comparative genomics. This multi-purpose platform enables researchers to easily and accurately annotate an entire phage genome. The portal can be accessed at https://cpt.tamu.edu/galaxy-pub with accompanying user training material.

Highlights

  • Bacteriophage, or phage, are the viruses of bacteria

  • The service offered with deposit to GenBank, the Prokaryotic Genome Annotation Pipeline (PGAP), is available as a stand-alone program for bacterial and archaeal genomes and is used for all RefSeq sequences [67,68]

  • The powerful suite of tools housed in the Center for Phage Technology (CPT) Galaxy instance provides a dynamic, scalable framework for genome data analyses that can be used independently of, or in conjunction with, the community genome editor, Apollo



Introduction

Bacteriophage, or phage, are the viruses of bacteria. Their study cracked open critical concepts in genetics and allowed detailed gene mapping before genome maps could be generated with ease [1,2]. While phage genomes were the first to be sequenced in their entirety, phage research declined considerably before sequencing technologies took off. Researchers from disparate fields have now come to a new appreciation for the potential that phage have to help solve current problems, as well as the commensurate challenges facing their application [3]. One stated intent is to establish organized repositories, or phage banks, as a community resource for their distribution [4,5]. Together with a surge in the use of phage for education and research training, this has led to an incredible boom in phage sequencing, with great promise for extending our understanding of fundamental phage biology [6].

