MicroRNAs (miRNAs) are small endogenous noncoding RNAs that inhibit protein expression at the translation level via complementary binding to a specific site in the mRNA. Most functional binding sites (complementary to miRNA nucleotides 2–7) located within the 3' untranslated region are conserved. These sites are predicted using the methods of comparative genomics, primarily multiple sequence alignment. However, multiple sequence alignments are prone to accumulating errors when the species are strongly diverged. Moreover, in the course of evolution, binding sites can migrate along the sequence. The aim of this work was to estimate the fraction of conserved miRNA-binding sites that cannot be predicted using the existing tools because of these phenomena. The concept of L-conserved sites is introduced: a site is termed L-conserved if it is present in each sequence in the alignment within a frame of length L. With this definition, we detected significantly more conserved sites without loss of sensitivity. The effect of species divergence on this increase was also evaluated.
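The definition of an L-conserved site can be illustrated with a minimal sketch. The abstract does not specify how the frame is measured, so the code below assumes that the frame is a window of L alignment columns and that gap characters are removed from each row before searching. The function names (`seed_site`, `is_l_conserved`), the example miRNA sequence, and the toy alignment are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: not the authors' implementation.
# Assumption: the frame is a window of L alignment columns, gaps ("-") removed.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_site(mirna: str) -> str:
    """Return the mRNA motif complementary to miRNA nucleotides 2-7 (1-based)."""
    return reverse_complement(mirna[1:7])


def is_l_conserved(alignment: list[str], site: str, start: int, frame: int) -> bool:
    """Check whether every aligned sequence contains `site` inside the window
    of `frame` alignment columns beginning at column `start` (0-based)."""
    for row in alignment:
        window = row[start:start + frame].replace("-", "")
        if site not in window:
            return False
    return True


# Hypothetical example miRNA and toy 3' UTR alignment (three species).
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
site = seed_site(mirna)

alignment = [
    "AAUACCUCGGAAAA",  # site starts at column 2
    "AAGGUACCUCAAAA",  # site shifted to column 4
    "AAGG--UACCUCAA",  # site shifted further, after a gap
]

strict = is_l_conserved(alignment, site, start=2, frame=len(site))
relaxed = is_l_conserved(alignment, site, start=2, frame=10)
print(site, strict, relaxed)
```

In this toy alignment the site occupies different columns in each species, so a strict column-by-column check (frame equal to the site length) misses it, whereas a frame of 10 columns recovers it as conserved. Choosing the frame length trades tolerance to alignment errors and site migration against the chance of matching unrelated occurrences of the motif.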