Untranslated regions (UTRs) in eukaryotes play a significant role in regulating translation and mRNA half-life, and they interact with specific RNA-binding proteins. However, UTRs have received less attention than elements regarded as more crucial, such as genes. The basic structural and evolutionary characteristics of UTRs in different species, and the relationship between these UTRs and genome size and gene number, are not well understood. To address these questions, we performed a comparative analysis of 5′ and 3′ UTRs across species, examining the basic characteristics of 244,976 UTRs from three eukaryotic kingdoms (Plantae, Fungi, and Protista). UTR length and the frequency of simple sequence repeats (SSRs) in UTRs increased significantly with gene number, while the length and G+C content of 5′ UTRs and the abundance of different types of repetitive sequences in 3′ UTRs increased with genome size. We also found that the length of 5′ UTRs was significantly positively correlated with the presence of transposons and SSRs, whereas the length of 3′ UTRs was significantly positively correlated with the presence of tandem repeat sequences. These results suggest that the evolution of species complexity from lower to higher organisms is accompanied by an increase in the regulatory complexity of UTRs, mediated by increasing UTR length, increasing G+C content of 5′ UTRs, and the insertion and expansion of repetitive sequences.