Abstract

In prokaryotes, simple sequence repeats (SSRs) with unit sizes of 1–5 nucleotides (nt) cause phase and antigenic variation. Although an increased abundance of heptameric repeats has been noticed in bacteria, reports about SSRs of 6–9 nt are rare. In particular, G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4-forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and in cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms, repeats are spread across the genome, with an over-representation in non-coding regions. Extensive variation in the number of repetitive units was observed, with repeat numbers ranging from two to 26 units. However, a clear preference for four units was detected. This strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation by the consensus repeat sequences was found in biophysical studies using CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions and between divergent ORFs. They are preferentially located within 50 bp upstream of an ORF on the anti-sense strand or within 50 bp of the stop codon on the sense strand. Analysis of whole-transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and widespread quadruplex-forming repeat sequences in bacteria.
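The kind of in silico screen described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the regular-expression approach, and the restriction to perfect, uninterrupted units are all assumptions made for illustration. It scans both strands of a sequence for runs of two or more consecutive GGGNATC units and reports the number of units in each run.

```python
import re

# Heptameric unit named in the abstract; N is any nucleotide.
# Assumption: only perfect, uninterrupted units are counted.
UNIT = "GGG[ACGT]ATC"
UNIT_LENGTH = 7

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_tandem_repeats(seq, min_units=2):
    """Return sorted (start, units, strand) tuples for GGGNATC runs.

    `start` is a 0-based coordinate on the forward strand, `units` is
    the number of consecutive heptamers, and `strand` is "+" or "-".
    (Hypothetical helper, not taken from the paper.)
    """
    seq = seq.upper()
    pattern = re.compile(f"(?:{UNIT}){{{min_units},}}")
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            units = (m.end() - m.start()) // UNIT_LENGTH
            # Map minus-strand matches back to forward coordinates.
            start = m.start() if strand == "+" else len(seq) - m.end()
            hits.append((start, units, strand))
    return sorted(hits)


if __name__ == "__main__":
    example = "TTT" + "GGGAATC" * 4 + "AAA"
    for start, units, strand in find_tandem_repeats(example):
        print(start, units, strand)
```

Tallying the `units` values over a whole genome would give the repeat-number distribution in which the abstract reports a preference for four units. A real analysis would also need to handle ambiguous bases and imperfect repeats, which this sketch ignores.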

Highlights

  • Non-B DNA structures have been identified in eukaryotes as well as prokaryotes [1, 2]

  • In this report we focus on G-rich heptameric repeats of the type GGGAATC in the plant pathogens Xanthomonas campestris pv. campestris ATCC 33913 (Xcc) [45] and Xanthomonas axonopodis pv. citri str. 306 (Xac) [46]

  • We noticed an intriguing overrepresentation of GGGAATC repeat patterns among the putative quadruplex patterns, which led us to screen the genomes of Xcc and the related species Xac for GGGAATC-containing tandem repeats

Summary

Introduction

Non-B DNA structures have been identified in eukaryotes as well as prokaryotes [1, 2]. Other examples of sequences that can give rise to noncanonical DNA structures include palindromes and close inverted repeats [7], simple sequence repeats (SSRs) [8, 9], and G-quadruplex (G4) forming sequences [10, 11]. Among these structural elements, mutagenic effects on DNA have been associated especially with SSRs [12]. In addition to genomic instability, there is increasing evidence that non-canonical nucleic acid structures directly or indirectly influence replication, recombination, transcription and translation at the DNA or RNA level [1, 10, 18–23].

