Abstract

Background: Codon substitution probabilities are used in many types of molecular evolution studies, such as determining Ka/Ks ratios, creating ancestral DNA sequences or aligning coding DNA. Until the recent dramatic increase in genomic data enabled construction of empirical matrices, researchers relied on parameterized models of codon evolution. Here we present the first empirical codon substitution matrix built entirely from alignments of coding sequences from vertebrate DNA, and thus provide an alternative to parameterized models of codon evolution.

Results: A set of 17,502 alignments of orthologous sequences from five vertebrate genomes yielded 8.3 million aligned codons, from which the number of substitutions between codons was counted. From these data, both a probability matrix and a matrix of similarity scores were computed. They are 64 × 64 matrices describing the substitutions between all codons. Substitutions from sense codons to stop codons are not considered, resulting in block diagonal matrices consisting of 61 × 61 entries for the sense codons and 3 × 3 entries for the stop codons.

Conclusion: The amount of genomic data currently available allowed for the construction of an empirical codon substitution matrix. However, more sequence data is still needed to construct matrices from different subsets of DNA, specific to kingdoms, evolutionary distance or different amounts of synonymous change. Codon mutation matrices have advantages for alignments up to medium evolutionary distances and for uses that require DNA, such as ancestral reconstruction of DNA sequences and the calculation of Ka/Ks ratios.
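
The counting step described in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the symmetric counting, the plain row normalization and the toy input pairs are all assumptions made for illustration, and the similarity-score matrix is not covered because the abstract does not give its formula.

```python
from itertools import product

BASES = "TCAG"
CODONS = ["".join(p) for p in product(BASES, repeat=3)]  # all 64 codons
STOPS = {"TAA", "TAG", "TGA"}                            # standard genetic code
INDEX = {codon: i for i, codon in enumerate(CODONS)}


def count_substitutions(aligned_pairs):
    """Count aligned codon pairs in a 64 x 64 table (hypothetical helper).

    Pairs that align a sense codon with a stop codon are skipped, which
    keeps the table block diagonal (61 x 61 sense, 3 x 3 stop).
    Counting is symmetric here; that is an assumption of this sketch.
    """
    counts = [[0.0] * 64 for _ in range(64)]
    for a, b in aligned_pairs:
        if a not in INDEX or b not in INDEX:
            continue  # skip gaps and ambiguous codons
        if (a in STOPS) != (b in STOPS):
            continue  # sense <-> stop substitutions are not considered
        i, j = INDEX[a], INDEX[b]
        counts[i][j] += 1
        counts[j][i] += 1
    return counts


def to_probabilities(counts):
    """Normalize each row so it sums to 1 (rows with no counts stay zero)."""
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([x / total if total else 0.0 for x in row])
    return probs


# Toy input, invented for illustration only.
pairs = [
    ("ATG", "ATG"),
    ("CTG", "CTT"),
    ("CTG", "CTG"),
    ("TAA", "TAG"),
    ("TGG", "TGA"),  # sense vs stop: ignored
    ("---", "ATG"),  # gap: ignored
]
counts = count_substitutions(pairs)
probs = to_probabilities(counts)
```

With real data, `pairs` would be the aligned codon columns from the orthologous alignments; the row normalization shown here is only the simplest way to turn counts into substitution frequencies.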

Highlights

  • Synonymous mutations: it has been observed that different genes have different Ka/Ks ratios, so the fraction of synonymous substitutions will differ between gene pairs at a given PAM distance

Introduction

Codon substitution probabilities are used in many types of molecular evolution studies, such as determining Ka/Ks ratios, creating ancestral DNA sequences or aligning coding DNA. Until the recent dramatic increase in genomic data enabled construction of empirical matrices, researchers relied on parameterized models of codon evolution. We present the first empirical codon substitution matrix built entirely from alignments of coding sequences from vertebrate DNA and provide an alternative to parameterized models of codon evolution. Models for codon substitutions are used in computational biology for a wide range of applications, such as reconstructing ancestral DNA sequences, determining Ka/Ks ratios to identify periods of adaptive evolution, and aligning coding DNA.
