Abstract

Research on the codon composition, usage patterns, and influencing factors of nuclear genes in soybean [Glycine max (L.) Merr.] can provide a theoretical basis for applying genetic engineering technology to improve soybean varieties. In this paper, a total of 46 430 high-confidence predicted coding sequences obtained from the soybean genome database and 2071 full-length transcripts obtained from cDNA libraries were used to analyze the composition and characteristics of soybean nuclear gene codons. The nucleotide composition, relative synonymous codon usage, and other parameters of the genome sequences and full-length transcripts were calculated using CodonW software. The results showed that gene expression levels were significantly and positively correlated with G+C and GC3s contents, and that genes with high G+C and GC3s contents had high codon preference. UCC and GCC were identified as optimal codons in soybean. Analysis of coding sequences of different lengths showed that codon preference decreased as coding sequence (CDS) length increased, and that longer CDSs tended to select codons randomly. According to the full-length transcript data, CDSs 400 to 600 bp in length had the highest expression level. Codon preference and expression level were almost identical between leaf-specific and seed-specific genes. However, seed-specific genes had significantly higher G+C and GC3s contents than leaf-specific genes, and the contents of aromatic amino acids encoded by seed-specific genes were significantly lower than those encoded by leaf-specific genes.
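To make two of the parameters named above concrete, the following is a minimal Python sketch of relative synonymous codon usage (RSCU) and GC3s, using their commonly cited definitions. It is not the CodonW implementation and not the authors' pipeline. The function names (`count_codons`, `rscu`, `gc3s`), the use of the standard genetic code, and the toy sequence in the usage note are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code (an assumption), built in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codons(cds):
    """Count in-frame codons of a coding sequence (RNA 'U' is read as 'T')."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    # Codons with ambiguous bases (e.g. N) are skipped.
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(counts):
    """RSCU = observed codon count / mean count over its synonymous family."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        # Skip stops, single-codon amino acids (Met, Trp), and unused families.
        if aa == "*" or len(codons) == 1 or total == 0:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def gc3s(counts):
    """Fraction of G or C at the third position of synonymous codons
    (Met, Trp, and stop codons excluded)."""
    synonymous = [c for c, aa in CODON_TABLE.items() if aa not in "MW*"]
    total = sum(counts[c] for c in synonymous)
    gc = sum(counts[c] for c in synonymous if c[2] in "GC")
    return gc / total if total else float("nan")
```

As a usage example on a made-up sequence, `rscu(count_codons("ATGGCCGCCGCTTCCTGA"))` gives GCC a value above 1 (used more often than expected under uniform synonymous usage) and GCA a value of 0. An RSCU of 1 for every codon in a family corresponds to the random codon selection the abstract describes for longer CDSs.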

