Abstract
Background: Codon usage analysis has been a classical topic for decades and is significant for studies of evolution, mRNA translation, and new gene discovery. Codon usage varies among members of the plant kingdom, which makes species-specific study necessary, yet this work has mostly been limited to model organisms. Recently, the development of deep sequencing, especially RNA-Seq, has made it possible to carry out such studies in non-model species.
Results: RNA-Seq data from Chinese bayberry were analyzed to investigate bias in codon usage and codon pairs. High-frequency codons (AGG, GCU, AAG and GAU) and low-frequency ones (NCG and NUA codons) were identified, and 397 high-frequency codon pairs were observed. In addition, 26 preferred and 141 avoided neighboring codon pairs were identified, which showed more significant bias than the same pairs with one or more intervening codons. Codon patterns were also analyzed at the plant kingdom, organism and gene levels. Changes during plant evolution were evident using RSCU (relative synonymous codon usage), which revealed them even more significantly than GC3s (GC content at the third position of synonymous codons). Nine GO categories were differentially and independently influenced by CAI (codon adaptation index) or GC3s, especially in the 'Molecular function' category. Within a gene, the average CAI increased from 0.720 to 0.785 in the first 50 codons, and then more slowly thereafter. Furthermore, the preferred as well as avoided codons at the position just following the start codon AUG were identified and discussed in relation to the key positions in Kozak sequences.
Conclusion: A comprehensive codon usage table and a number of high-frequency codon pairs were established. Bias in codon usage as well as in neighboring codon pairs was observed, and its significance in avoiding DNA mutation, increasing protein production and regulating protein synthesis rate was proposed. Codon usage patterns at three levels were revealed, and their significance in plant evolution analysis, gene function classification, and protein translation start site prediction was discussed. This work promotes the study of codon biology and provides a reference for the analysis and comprehensive application of RNA-Seq data from other non-model species.
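For readers unfamiliar with RSCU, the following is a minimal Python sketch of how the measure is commonly computed: for each amino acid, a codon's observed count is divided by the count expected if all synonymous codons were used equally. It assumes the standard genetic code and the conventional definition of RSCU; the function name `rscu` and the input format (a list of in-frame ORF strings) are illustrative choices, and the paper's own pipeline may differ.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (assumption: no alternative code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def rscu(orfs):
    """Return {codon: RSCU} for codons of amino acids with >1 synonymous codon.

    `orfs` is an iterable of in-frame coding sequences (DNA or RNA letters).
    Stop codons and single-codon amino acids (Met, Trp) are skipped.
    """
    counts = Counter()
    for seq in orfs:
        seq = seq.upper().replace("U", "T")
        # Ignore any trailing bases that do not form a complete codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    result = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the data set
        for c in codons:
            # observed count / (family total / family size)
            result[c] = counts[c] * len(codons) / total
    return result
```

An RSCU above 1 indicates a codon used more often than expected under equal synonymous usage, and a value below 1 indicates a codon used less often.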
Highlights
Codon usage analysis has been a classical topic for decades and is significant for studies of evolution, mRNA translation, and new gene discovery
Codon usage analysis in Chinese bayberry was based on 1,066 full-length open reading frame (ORF) sequences, obtained after several rounds of filtering of 31,665 mRNAs assembled from our previous RNA-Seq data
The overall GC content of the 354,551 codons in the study is 0.477, but it varies by codon position: it is highest at the first position (GC1, 0.536), lowest at the second (GC2, 0.411), and intermediate at the third (GC3, 0.484). This is consistent with observations in other plants, such as citrus [20], apple, woodland strawberry, and Arabidopsis thaliana (Additional file 2)
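The position-specific GC values above can be illustrated with a short Python sketch. It counts G and C at each of the three codon positions across a set of in-frame ORFs. The function name `gc_by_position` and the input format are hypothetical, and the sketch counts every codon it is given; whether the original analysis excluded stop codons or any other codons is not stated here, so that is an assumption.

```python
def gc_by_position(orfs):
    """Return GC fractions at codon positions 1, 2, 3 and overall.

    `orfs` is an iterable of in-frame coding sequences (DNA or RNA letters).
    """
    gc_counts = [0, 0, 0]  # G/C seen at codon positions 1, 2, 3
    totals = [0, 0, 0]     # all bases seen at codon positions 1, 2, 3
    for seq in orfs:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
        for i in range(usable):
            pos = i % 3
            totals[pos] += 1
            if seq[i] in "GC":
                gc_counts[pos] += 1

    def frac(n, d):
        return n / d if d else 0.0

    return {
        "GC1": frac(gc_counts[0], totals[0]),
        "GC2": frac(gc_counts[1], totals[1]),
        "GC3": frac(gc_counts[2], totals[2]),
        "GC": frac(sum(gc_counts), sum(totals)),
    }
```

Because each codon contributes exactly one base to each position, the overall GC content equals the mean of the three position values. The reported figures agree with this: (0.536 + 0.411 + 0.484) / 3 = 0.477.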
Summary
Codon usage analysis has been a classical topic for decades and is significant for studies of evolution, mRNA translation, and new gene discovery. The development of deep sequencing, especially RNA-Seq, has made it possible to carry out such studies in non-model species. Codon biology is founded on the study of full-length ORF (open reading frame) sequences from a range of species such as Caenorhabditis, Drosophila, Arabidopsis [3], Populus [7], apple [8], kiwifruit [9], and melon [10], which have been obtained mainly by EST technology in recent decades. Similar studies in non-model plants have been neglected, despite the existence of sequence data assembled from RNA-Seq. Further research and analysis in this area can aid the understanding of crop breeding