Abstract

Background: Amino acid substitution models play an essential role in inferring phylogenies from mitochondrial protein data. However, only a few empirical models have been estimated, and those from limited mitochondrial protein data of a hundred species. The existing models are unlikely to represent appropriately the amino acid substitutions in hundreds of thousands of metazoan mitochondrial protein sequences.

Results: We selected 125,935 mitochondrial protein sequences from 34,448 species in the metazoan kingdom to estimate new amino acid substitution models targeting the metazoan, vertebrate, and invertebrate groups. The new models yield phylogenies with significantly better likelihoods than the existing models. Phylogenies inferred with the existing models lie at considerable distances from the maximum likelihood phylogenies, which indicates a considerable number of incorrect bipartitions in the phylogenies inferred with the existing models. Finally, we used the new models and mitochondrial protein data to confirm that Testudines, Aves, and Crocodylia form a separate clade within amniotes.

Conclusions: We introduced new mitochondrial amino acid substitution models for metazoan mitochondrial proteins. The new models outperform the existing models in inferring phylogenies from metazoan mitochondrial protein data. We strongly recommend that researchers use the new models when analysing metazoan mitochondrial protein data.

Highlights

  • Amino acid substitution models play an essential role in inferring phylogenies from mitochondrial protein data

  • Experimental results show the advantage of the new models over existing mt models in inferring maximum likelihood phylogenies

  • We introduced three new mt models estimated from large mt protein datasets of metazoan, vertebrate, and invertebrate species

Summary

Introduction

Amino acid substitution models play an essential role in inferring phylogenies from mitochondrial protein data. The existing models are unlikely to represent appropriately the amino acid substitutions in hundreds of thousands of metazoan mitochondrial protein sequences. An amino acid substitution model (model for short) consists of a 20 × 20 matrix and an amino acid frequency vector. Such models are key to inferring phylogenies from protein data. Estimating amino acid substitution models is much more challenging than estimating nucleotide substitution models because of the large number of parameters to be optimised. The general time reversible model for nucleotides contains 8 parameters, compared with 208 parameters for models of amino acid substitutions. Amino acid substitution models are typically estimated from large datasets.
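The parameter counts of 8 and 208 quoted above can be reproduced with a short calculation. The sketch below assumes the usual way of counting free parameters in a general time reversible model: a symmetric exchangeability matrix with one entry fixed to set the rate scale, plus equilibrium frequencies that sum to 1. The function name `free_parameters` is illustrative and does not come from the paper.

```python
def free_parameters(n_states: int) -> int:
    """Count free parameters of a general time reversible model.

    Assumes the standard parameterisation: symmetric exchangeabilities
    plus an equilibrium frequency vector.
    """
    # One exchangeability per unordered pair of distinct states.
    exchangeabilities = n_states * (n_states - 1) // 2
    # One exchangeability is fixed to set the overall rate scale.
    free_exchangeabilities = exchangeabilities - 1
    # Frequencies sum to 1, so the last one is determined by the others.
    free_frequencies = n_states - 1
    return free_exchangeabilities + free_frequencies


print(free_parameters(4))   # nucleotides: 5 + 3 = 8
print(free_parameters(20))  # amino acids: 189 + 19 = 208
```

Moving from 4 to 20 states raises the number of state pairs from 6 to 190, which is why the amino acid case needs far more data to estimate reliably.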
