Abstract

Evolutionary relationships among organisms are commonly described by using a hierarchy derived from comparisons of ribosomal RNA (rRNA) sequences. We propose that even on the level of a single rRNA molecule, an organism's evolution is composed of multiple pathways due to concurrent forces that act independently upon different rRNA degrees of freedom. Relationships among organisms are then compositions of coexisting pathway-dependent similarities and dissimilarities, which cannot be described by a single hierarchy. We computationally test this hypothesis in comparative analyses of 16S and 23S rRNA sequence alignments by using a tensor decomposition, i.e., a framework for modeling composite data. Each alignment is encoded in a cuboid, i.e., a third-order tensor, where nucleotides, positions, and organisms each represent a degree of freedom. A tensor mode-1 higher-order singular value decomposition (HOSVD) is formulated such that it separates each cuboid into combinations of patterns of nucleotide frequency variation across organisms and positions, i.e., “eigenpositions” and corresponding nucleotide-specific segments of “eigenorganisms,” respectively, independent of a priori knowledge of the taxonomic groups or rRNA structures. We find, in support of our hypothesis, that, first, the significant eigenpositions reveal multiple similarities and dissimilarities among the taxonomic groups. Second, the corresponding eigenorganisms identify insertions or deletions of nucleotides exclusively conserved within the corresponding groups. These insertions and deletions map out entire substructures and are enriched in adenosines that are unpaired in the rRNA secondary structure and participate in tertiary structure interactions. This demonstrates that structural motifs involved in rRNA folding and function are evolutionary degrees of freedom. Third, two previously unknown coexisting subgenic relationships between Microsporidia and Archaea are revealed in both the 16S and 23S rRNA alignments, a convergence and a divergence, conferred by insertions and deletions of these motifs, which cannot be described by a single hierarchy. This shows that mode-1 HOSVD modeling of rRNA alignments might be used to computationally predict evolutionary mechanisms.
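
The encoding and decomposition described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a five-symbol alphabet (`ACGU-`), a four-sequence toy alignment, the hypothetical helper names `encode_alignment` and `mode1_svd`, and one plausible reading of the mode-1 decomposition, namely an ordinary SVD of the cuboid unfolded into a (nucleotides × positions)-by-organisms matrix. Under that reading, each left singular vector splits into nucleotide-specific segments across positions (the "eigenorganisms") and each right singular vector is a pattern across organisms (the "eigenpositions").

```python
import numpy as np

# Assumed symbol set: four nucleotides plus a gap. The paper's actual
# alphabet may differ; this is for illustration only.
ALPHABET = "ACGU-"


def encode_alignment(seqs, alphabet=ALPHABET):
    """One-hot encode an alignment as a nucleotides x positions x organisms cuboid."""
    n_pos = len(seqs[0])
    tensor = np.zeros((len(alphabet), n_pos, len(seqs)))
    for k, seq in enumerate(seqs):          # k indexes organisms
        for j, symbol in enumerate(seq):    # j indexes alignment positions
            tensor[alphabet.index(symbol), j, k] = 1.0
    return tensor


def mode1_svd(tensor):
    """SVD of the cuboid unfolded to (nucleotides * positions) x organisms.

    This unfolding is an assumption about what "mode-1" means here.
    """
    n_nuc, n_pos, n_org = tensor.shape
    unfolded = tensor.reshape(n_nuc * n_pos, n_org)
    u, s, vt = np.linalg.svd(unfolded, full_matrices=False)
    # Reshape each left singular vector into its nucleotide-specific segments.
    eigenorganisms = u.T.reshape(-1, n_nuc, n_pos)
    eigenpositions = vt  # each row is a pattern across organisms
    return eigenorganisms, s, eigenpositions


# Hypothetical toy alignment: 4 organisms, 6 aligned positions.
toy_alignment = ["ACGU-A", "ACGUAA", "GCGU-A", "GCAUCA"]
tensor = encode_alignment(toy_alignment)
eigenorganisms, s, eigenpositions = mode1_svd(tensor)

# Fraction of the overall signal captured by each pattern pair.
fractions = s**2 / np.sum(s**2)
print(fractions)
```

In this sketch, `eigenorganisms[i, 0]` is the A segment of the i-th eigenorganism, so positions with the largest entries in that segment are the ones where adenosine frequency increases most along the i-th pattern.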

Highlights

  • The ribosomal RNA is an essential component of the ribosome, the cellular organelle that associates the cell’s genotype with its phenotype by catalyzing protein synthesis in all known organisms, and underlies cellular evolution

  • The significant eigenpositions and corresponding nucleotide-specific segments of eigenorganisms represent multiple subgenic evolutionary relationships of convergence and divergence and correlations with structural motifs, some known and some previously unknown, that are consistent with current biological understanding of the 16S and 23S ribosomal RNA (rRNA)

  • We find the positions with largest nucleotide frequency increase in the A segment of the second 16S eigenorganism to be enriched in unpaired adenosines, which are exclusively conserved in the Eukarya excluding the Microsporidia (Figures S7 and S8 in Appendix S1), whereas the positions with largest decrease include all 50 unpaired adenosines exclusively conserved in the Bacteria (Figure 4)

Summary

Introduction

The ribosomal RNA (rRNA) is an essential component of the ribosome, the cellular organelle that associates the cell’s genotype with its phenotype by catalyzing protein synthesis in all known organisms, and underlies cellular evolution. RNAs are thought to be among the most primordial macromolecules. This is because an RNA template, like a DNA template, can be used to synthesize DNA and RNA, while RNA, like proteins, can form three-dimensional structures and catalyze reactions. It was suggested that rRNA sequences and structures that are similar or dissimilar among groups of organisms are indicative of the relative evolutionary pathways of these organisms [1,2,3]. Advances in sequencing technologies have resulted in an abundance of rRNA sequences from organisms spanning all taxonomic groups. The small subunit ribosomal RNA (16S rRNA) is the gene with the largest number of determined sequences
