Consider a random word $$X^n=(X_1,\ldots,X_n)$$ over a four-letter alphabet, with the letters viewed either as A, U, G and C (i.e., nucleotides in an RNA sequence) or as $$\alpha$$, $$\bar{\alpha}$$, $$\beta$$ and $$\bar{\beta}$$ (i.e., generators of the free group $$\langle \alpha,\beta \rangle$$ and their inverses). We show that the expected fraction $$\rho(n)$$ of unpaired bases in an optimal RNA secondary structure (with only Watson-Crick bonds and no pseudo-knots) converges to a constant $$\lambda_2$$ with $$0<\lambda_2<1$$ as $$n\rightarrow \infty$$. Thus, a positive proportion of the bases of a random RNA string do not form hydrogen bonds. We do not know the exact value of $$\lambda_2$$, but we derive upper and lower bounds for it. In terms of free groups, $$\rho(n)$$ is the expected ratio of two word lengths of $$X^n$$: its length with respect to the generating set consisting of all conjugates of the generators and their inverses, divided by its length with respect to the standard generators and their inverses. Thus, for a typical word, the word length in the (infinite) generating set of conjugates of the standard generators grows linearly with the word length in the standard generators. In fact, we show that a similar result holds for all non-abelian finitely generated free groups $$\langle \alpha_1,\dots,\alpha_k\rangle$$, $$k\ge 2$$.
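To make the quantity $$\rho(n)$$ concrete, the following is a minimal Python sketch, not the authors' method. It computes the minimum number of unpaired bases with a standard Nussinov-style dynamic program over non-crossing A-U and G-C pairings, and then estimates $$\rho(n)$$ by simulation. It rests on assumptions that the abstract does not state: the letters are drawn independently and uniformly, and adjacent bases may pair (no minimum hairpin loop length), which matches the free-group picture where a letter cancels with an adjacent inverse. The function names `min_unpaired` and `estimate_rho` are hypothetical.

```python
import random

# Watson-Crick pairs; in the free-group reading, A/U and G/C play the
# roles of a generator and its inverse.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def min_unpaired(word):
    """Minimum number of unpaired letters over all non-crossing pairings.

    Assumption: no minimum loop length, so adjacent letters may pair.
    Runs in O(n^3) time and O(n^2) space.
    """
    n = len(word)
    # best[i][j] = maximum number of pairs within the half-open slice word[i:j]
    best = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Case 1: word[i] stays unpaired.
            value = best[i + 1][j]
            # Case 2: word[i] pairs with word[k]; the pair splits the slice
            # into an inside part and an outside part (no pseudo-knots).
            for k in range(i + 1, j):
                if (word[i], word[k]) in PAIRS:
                    value = max(value, 1 + best[i + 1][k] + best[k + 1][j])
            best[i][j] = value
    return n - 2 * best[0][n]


def estimate_rho(n, trials=200, seed=0):
    """Monte Carlo estimate of the expected unpaired fraction for length n.

    Assumption: letters are i.i.d. uniform on {A, U, G, C}.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        word = "".join(rng.choice("AUGC") for _ in range(n))
        total += min_unpaired(word)
    return total / (trials * n)
```

For example, `min_unpaired("GAUC")` is 0 because the G-C pair encloses the A-U pair, while `min_unpaired("AGUC")` is 2 because the A-U and G-C pairs would cross. Calling `estimate_rho(n)` for increasing `n` gives a rough empirical view of the convergence described above; the cubic running time keeps this practical only for moderate `n`.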