Abstract

Escherichia coli rho-independent transcription terminators are characterized by an RNA structure having a G+C-rich stem-loop followed by a series of uridine residues, but they can be only partially predicted by the stability of this structure or by its primary sequence. A large number of such terminators have been identified or proposed in the literature, and we have compiled a list of them (148 found in 1021 × 10^3 base-pairs of E. coli DNA sequences) in order to analyze the corresponding RNA hairpins statistically. We show that loop size has a narrow distribution, that loop sequences are not random, and that most loops are closed by a C·G base-pair. In particular, 55% of the loops are tetranucleotides, and the most abundant loop sequences are UUCG and GAAA. These loops are abundant in prokaryotic and eukaryotic RNAs, and are known to enhance the stability of RNA hairpins. We propose that these tetraloops play an important role in the nucleation of the nascent RNA structures, as does the presence of a C·G base-pair closing a hairpin loop. This analysis allows us to propose a model for the formation of an RNA hairpin during the termination process and to construct an algorithm for predicting terminators in a given DNA sequence. For the E. coli sequences, the algorithm clearly distinguishes intercistronic from intracistronic terminator-like structures, and selects 141 of the 148 rho-independent terminators given in the literature, with a very low background. It also predicts with reasonable accuracy the in vitro termination efficiency of known rho-independent terminators, and it predicts the existence of 35 as yet uncharacterized terminators.
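
The structural hallmark described above (a base-paired stem, a short loop, and a downstream run of uridines) can be illustrated with a minimal Python sketch. It is not the prediction algorithm of this work: the function name `find_hairpin_u_tracts`, the pairing rules, and every threshold (`min_stem`, `loop_sizes`, `tail_len`, `min_u`) are assumptions chosen for illustration, and no stability or efficiency score is computed.

```python
# Illustrative scan for "stem-loop followed by a U-tract" candidates.
# All names and thresholds here are hypothetical, not the published parameters.

# Watson-Crick pairs plus the G.U wobble pair (an assumption of this sketch).
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def find_hairpin_u_tracts(seq, min_stem=5, loop_sizes=range(3, 9),
                          tail_len=8, min_u=5):
    """Return a list of dicts describing hairpins followed by a U-rich tail."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    n = len(rna)
    hits = []
    for loop_start in range(1, n):
        for loop_len in loop_sizes:
            loop_end = loop_start + loop_len  # first base after the loop
            if loop_end >= n:
                continue
            # Extend the stem outward from the loop-closing pair.
            i, j = loop_start - 1, loop_end
            stem_len = 0
            while i >= 0 and j < n and (rna[i], rna[j]) in PAIRS:
                stem_len += 1
                i -= 1
                j += 1
            if stem_len < min_stem:
                continue
            # j now indexes the first unpaired base 3' of the stem.
            tail = rna[j:j + tail_len]
            if tail.count("U") < min_u:
                continue
            hits.append({
                "stem_start": i + 1,
                "stem_len": stem_len,
                "loop": rna[loop_start:loop_end],
                "is_tetraloop": loop_len == 4,
                # Loop closed by a C.G pair: C on the 5' side, G on the 3' side.
                "cg_closed": rna[loop_start - 1] == "C" and rna[loop_end] == "G",
                "tail": tail,
            })
    return hits


if __name__ == "__main__":
    # Toy sequence: 5-bp G+C-rich stem, UUCG tetraloop, eight uridines.
    toy = "CC" + "GCCGC" + "UUCG" + "GCGGC" + "UUUUUUUU"
    for hit in find_hairpin_u_tracts(toy):
        print(hit)
```

On the toy sequence, the scan reports a 5 base-pair stem enclosing a UUCG tetraloop closed by a C·G pair. Because the stem is extended greedily, an upstream A-run can pair with the uridine tail and shorten it; a real predictor would need to handle that case and weigh hairpin stability, which this sketch does not attempt.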
