Abstract

Background

The Codon Adaptation Index (CAI) is a measure of the synonymous codon usage bias of a DNA or RNA sequence. It quantifies the similarity between the synonymous codon usage of a gene and the synonymous codon frequency of a reference set. Extreme nucleotide or amino acid composition strongly affects the differential preference for synonymous codons. It is therefore essential to define limits for the expected value of CAI on the basis of sequence composition, in order to interpret the CAI properly and to provide statistical support for CAI analyses. Although several freely available programs calculate the CAI for a given DNA sequence, none of them corrects for compositional biases or provides confidence intervals for CAI values.

Results

The E-CAI server, available at , is a web application that calculates an expected value of CAI for a set of query sequences by generating random sequences with G+C and amino acid content similar to those of the input. An executable file, a tutorial, a Frequently Asked Questions (FAQ) section and several examples are also available. To exemplify the use of the E-CAI server, we have analysed the codon adaptation of human mitochondrial genes that encode a subunit of the mitochondrial respiratory chain (excluding those genes that lack a prokaryotic orthologue) and are encoded in the nuclear genome. It is assumed that these genes were transferred from the proto-mitochondrial to the nuclear genome and that their codon usage was then ameliorated.

Conclusion

The E-CAI server provides a direct threshold value for discerning whether differences in CAI are statistically significant or whether they are merely artifacts that arise from internal biases in the G+C composition and/or amino acid composition of the query sequences.
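To make the index described in the Background concrete, the following minimal sketch computes a CAI in the usual way: each codon receives a relative adaptiveness w (its reference count divided by the count of the most frequent synonymous codon), and the CAI of a sequence is the geometric mean of w over its codons. The reference counts, the two-amino-acid table and the function names are hypothetical and chosen only for illustration; this is not the E-CAI server's code.

```python
import math

# Hypothetical reference codon counts for two amino acids (toy numbers, not real data).
# The sketch assumes every count is greater than zero.
REFERENCE_COUNTS = {
    "K": {"AAA": 20, "AAG": 80},
    "F": {"TTT": 30, "TTC": 60},
}


def relative_adaptiveness(reference_counts):
    """Map each codon to w = count / max count among its synonymous codons."""
    weights = {}
    for codon_counts in reference_counts.values():
        best = max(codon_counts.values())
        for codon, count in codon_counts.items():
            weights[codon] = count / best
    return weights


def cai(sequence, weights):
    """Geometric mean of w over the codons of `sequence` that have a weight."""
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


WEIGHTS = relative_adaptiveness(REFERENCE_COUNTS)
```

With these toy weights, a sequence built only from the most frequent codons (for example `cai("AAGTTC", WEIGHTS)`) scores 1.0, and a sequence built from the rarer codons scores lower.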

Highlights

  • The Codon Adaptation Index (CAI) is a measure of the synonymous codon usage bias for a DNA or RNA sequence

  • It is essential to define a threshold level for the expected CAI value in order to interpret the significance of codon usage biases and to provide statistical support to CAI analyses

  • The expected CAI value (eCAI) estimated by our server makes it possible to discern whether differences in the CAI are statistically significant or whether they cannot be distinguished from biases due to nucleotide or amino acid composition
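The idea of an expected CAI threshold can be sketched as a small Monte Carlo simulation: keep the amino acid sequence of the query, draw synonymous codons at random according to a target G+C fraction, compute the CAI of each random sequence, and take an upper quantile of the resulting values. This is an illustrative simplification under stated assumptions, not the server's actual procedure. The weights, the two-amino-acid code, the independent-base codon model and the plain empirical quantile are all assumptions made here, and the sketch does not reproduce the server's confidence level and coverage calculation.

```python
import math
import random

# Toy genetic code restricted to two amino acids; the weights are hypothetical
# relative adaptiveness values, not taken from any real reference set.
SYNONYMOUS = {"K": ["AAA", "AAG"], "F": ["TTT", "TTC"]}
WEIGHTS = {"AAA": 0.25, "AAG": 1.0, "TTT": 0.5, "TTC": 1.0}


def cai(codons):
    """Geometric mean of the relative adaptiveness of each codon."""
    return math.exp(sum(math.log(WEIGHTS[c]) for c in codons) / len(codons))


def codon_probability(codon, gc):
    """Codon probability if each base is drawn independently with G+C fraction `gc`."""
    p = 1.0
    for base in codon:
        p *= gc / 2 if base in "GC" else (1 - gc) / 2
    return p


def random_codons(protein, gc, rng):
    """Keep the amino acid sequence; pick synonymous codons weighted by G+C content."""
    result = []
    for aa in protein:
        options = SYNONYMOUS[aa]
        probs = [codon_probability(c, gc) for c in options]
        result.append(rng.choices(options, weights=probs, k=1)[0])
    return result


def expected_cai_threshold(protein, gc, n=1000, quantile=0.95, seed=0):
    """Empirical upper quantile of CAI over `n` composition-matched random sequences.

    `gc` must lie strictly between 0 and 1.
    """
    rng = random.Random(seed)
    values = sorted(cai(random_codons(protein, gc, rng)) for _ in range(n))
    index = min(n - 1, int(quantile * n))
    return values[index]
```

A query gene whose observed CAI exceeds `expected_cai_threshold(protein, gc)` would, under these assumptions, show more codon adaptation than composition alone is expected to produce.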


Summary

Results

Example: The amelioration of mitochondrial genes encoded in the human nuclear genome

It is widely accepted that mitochondria originated in a single event, from a bacterial symbiont whose closest contemporary relatives are found within the alphaproteobacteria [23,24]. Among the nuclear-encoded mitochondrial genes, 34 out of 37 show a CAIhm above the expected upper limit at a 95% confidence level and 99% coverage, whereas only two genes have a CAImt above the expected upper limit at the same confidence level and coverage (Table 1a). We interpret this result to mean that the codon usage of genes that were originally encoded in the proto-mitochondrion and are now encoded in the human nuclear genome has been ameliorated and adapted to the human codon usage after their transfer to the nucleus. The normalised CAIhm differs markedly between the two populations (Figure 1), as shown by a Kolmogorov-Smirnov test (D = 1.0, P < 0.0001). This clearly shows that the codon usage of the nuclear-encoded genes is not due solely to mutational pressure or G+C content, and that a certain degree of codon usage adaptation exists. It has recently been reported that a weak positive correlation exists in humans between gene expression levels and the frequency of optimal codons [30,31].
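For readers unfamiliar with the test statistic quoted above, the two-sample Kolmogorov-Smirnov D is the largest vertical gap between the empirical cumulative distribution functions of two samples, so D = 1.0 means the two samples do not overlap at all. The sketch below computes only D, not the P value, and the input values are invented placeholders rather than data from Table 1a or Figure 1.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical normalised CAI values for two gene populations (illustration only).
group_one = [1.05, 1.08, 1.11, 1.12]
group_two = [0.90, 0.93, 0.95, 0.97]
d_separated = ks_statistic(group_one, group_two)
```

Because every value in `group_one` lies above every value in `group_two`, `d_separated` is 1.0. If a P value is also needed, a statistics library such as SciPy (`scipy.stats.ks_2samp`) is one option.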
