Abstract

Background: The completion of numerous genome sequences introduced an era of whole-genome study. However, many genes are missed during genome annotation, including small RNAs (sRNAs) and small open reading frames (sORFs). To improve genome annotation, we aimed to identify novel sRNAs and sORFs in Shigella, the principal etiologic agents of bacillary dysentery.

Methodology/Principal Findings: Based on sequence conservation, we identified 64 sRNAs in Shigella that had been experimentally validated in other bacteria. We employed computer-based and tiling array-based methods to search for sRNAs, followed by RT-PCR and northern blots, to identify nine sRNAs in Shigella flexneri strain 301 (Sf301) and 256 regions containing possible sRNA genes. We found 29 candidate sORFs using bioinformatic prediction, array hybridization and RT-PCR verification. We experimentally validated 557 (57.9%) DOOR operon predictions in the chromosomes of Sf301 and 46 (76.7%) in the virulence plasmid. We found 40 additional co-expressed gene pairs that were not predicted by DOOR.

Conclusions/Significance: We provide an updated and comprehensive annotation of the Shigella genome. Our study increased the expected numbers of sORFs and sRNAs, which will affect future functional genomics and proteomics studies. Our method can be used for large-scale reannotation of sRNAs and sORFs in any microbe with a known genome sequence.

Highlights

  • Genome sequence information has accumulated at a fast pace in recent years

  • A major problem is that many genes have been overlooked, including noncoding RNAs and small open reading frames (<100 amino acids; sORFs)

  • New bioinformatics and experimental strategies have identified a greater number of novel small RNA (sRNA) candidates in bacteria, including Escherichia coli [7,8,9,10,11,12], Vibrio cholerae [13,14,15], Staphylococcus aureus [16], Clostridium perfringens [17,18], Chlamydia trachomatis [19], Pseudomonas aeruginosa [20,21], Bacillus subtilis [22,23], Listeria monocytogenes [24,25], Salmonella typhimurium [26,27,28], Streptococcus pyogenes [29], Streptococcus pneumoniae [30,31], Mycobacterium tuberculosis [32], and many others

Introduction

Genome sequence information has accumulated at a fast pace in recent years. The generation of whole genome sequences creates new opportunities and resources for both basic and applied research. A major problem is that many genes have been overlooked, including noncoding RNAs (ncRNAs) and small open reading frames (<100 amino acids; sORFs). There has been considerable recent interest in ncRNAs, other than ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs), as important regulators in eukaryotes and prokaryotes [1,2,3,4,5]. These RNAs are collectively referred to as small RNAs (sRNAs) in bacteria, where they usually regulate gene expression by pairing with other RNAs as part of RNA-protein complexes, or by adopting the structures of other nucleic acids [2,6]. To improve genome annotation, we aimed to identify novel sRNAs and sORFs in Shigella, the principal etiologic agents of bacillary dysentery.

