Abstract

Background: Computational methods for structural gene annotation have propelled gene discovery but face certain drawbacks with regard to prokaryotic genome annotation. These approaches do not accurately identify transcriptional start sites, demarcate overlapping gene boundaries, or identify regulatory elements such as small RNA (sRNA). In this study, we revisit the structural annotation of Mannheimia haemolytica PHL213. M. haemolytica is one of the causative agents of bovine respiratory disease, which results in about $3 billion in annual losses to the cattle industry. We used RNA-Seq and analyzed the data using freely available computational methods and resources. The aim was to identify previously unannotated regions of the genome using an RNA-Seq based expression profile to complement the existing annotation of this pathogen.

Results: Using the Illumina Genome Analyzer, we generated 9,055,826 reads (average length ~76 bp) and aligned them to the reference genome using Bowtie. The transcribed regions were analyzed using SAMtools and custom Perl scripts in conjunction with BLAST searches and available gene annotation information. The single nucleotide resolution map enabled the identification of 14 novel protein coding regions as well as 44 potential novel sRNA. The basal transcription profile revealed that 2,506 of the 2,837 annotated regions were expressed in vitro, at 95.25% coverage, representing all broad functional gene categories in the genome. The expression profile also helped identify 518 potential operon structures involving 1,086 co-expressed pairs. We also identified 11 proteins with mutated/alternate start codons.

Conclusions: The application of RNA-Seq based transcriptome profiling to structural gene annotation helped correct existing annotation errors and identify potential novel protein coding regions and sRNA. We used computational tools to predict regulatory elements, such as promoters and terminators, associated with the novel expressed regions for further characterization of these novel functional elements. Our study complements the existing structural annotation of Mannheimia haemolytica PHL213 with experimental evidence. Given the role of sRNA in virulence gene regulation and stress response, the potential novel sRNA described in this study can form the framework for future studies to determine the role of sRNA, if any, in M. haemolytica pathogenesis.
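The idea of using a single nucleotide resolution expression map to flag transcribed regions outside the existing annotation can be illustrated with a minimal Python sketch. This is not the authors' Perl pipeline. It assumes per-position read depth has already been loaded into a list (for example, from a per-base depth table produced from the alignments), and the function names, the depth threshold, and the minimum region length are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# 1-based, inclusive genomic interval (start, end); a convention assumed here.
Interval = Tuple[int, int]


def expressed_regions(depth: List[int], min_depth: int = 1,
                      min_len: int = 1) -> List[Interval]:
    """Return maximal runs of positions with depth >= min_depth.

    Runs shorter than min_len positions are discarded. Both thresholds
    are illustrative assumptions, not values taken from the study.
    """
    regions: List[Interval] = []
    start = None
    for pos, d in enumerate(depth, start=1):
        if d >= min_depth:
            if start is None:
                start = pos  # a new covered run begins here
        elif start is not None:
            # the run covered start..pos-1, i.e. pos - start positions
            if pos - start >= min_len:
                regions.append((start, pos - 1))
            start = None
    # close a run that extends to the last position
    if start is not None and len(depth) - start + 1 >= min_len:
        regions.append((start, len(depth)))
    return regions


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def novel_regions(regions: List[Interval],
                  annotated: List[Interval]) -> List[Interval]:
    """Keep expressed regions that overlap no annotated feature."""
    return [r for r in regions
            if not any(overlaps(r, g) for g in annotated)]


if __name__ == "__main__":
    # Toy data: 12 positions of read depth and one annotated gene.
    toy_depth = [0, 5, 6, 7, 0, 0, 3, 4, 0, 9, 9, 9]
    toy_genes = [(2, 4)]
    covered = expressed_regions(toy_depth, min_depth=3, min_len=2)
    print(novel_regions(covered, toy_genes))
```

In a real analysis, candidate regions found this way would still need follow-up, such as the BLAST searches and promoter and terminator predictions mentioned in the abstract, before being called novel genes or sRNA.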

Highlights

  • Computational methods for structural gene annotation have propelled gene discovery but face certain drawbacks with regard to prokaryotic genome annotation

  • Read alignment to the M. haemolytica PHL213 genome: the M. haemolytica PHL213 draft genome is 2.6 Mb with a 40% G+C content and contains 2,837 annotated regions, of which 2,695 are protein coding [29]

  • Head-on comparison of RNA-Seq with microarrays has shown that RNA-Seq has negligible technical variability [30], making it possible to obtain a reliable estimate of gene expression without replicate analysis


Summary

Introduction

Computational methods for structural gene annotation have propelled gene discovery but face certain drawbacks with regard to prokaryotic genome annotation. Computational methods for prokaryotic gene annotation such as Gene Locator and Interpolated Markov ModelER (GLIMMER) [2] and GeneMark.hmm [3] use hidden Markov models [4] based on a sequence similarity measure generated from previously annotated genomes. These algorithms do not accurately identify all genes in the genome and sometimes introduce errors, especially in the positioning of translational start codons [5] and in the identification of small protein coding genes. sRNA, which regulate many biological processes, including virulence in bacterial pathogens, cannot be identified by computational approaches alone.
