Abstract

We study model embeddability, a variation of the famous embedding problem in probability theory in which, apart from requiring that the Markov matrix be the matrix exponential of a rate matrix, we additionally ask that the rate matrix follow the model structure. We provide a characterisation of model embeddable Markov matrices corresponding to symmetric group-based phylogenetic models. In particular, we provide necessary and sufficient conditions in terms of the eigenvalues of symmetric group-based matrices. To showcase our main result on model embeddability, we provide an application to hachimoji models, which are eight-state models for synthetic DNA. Moreover, our main result enables us to compute the volume of the set of model embeddable Markov matrices relative to the volume of other relevant sets of Markov matrices within the model.

Highlights

  • The embedding problem for stochastic matrices, also known as Markov matrices, asks whether a given stochastic matrix M is the matrix exponential of a rate matrix Q

  • The solution of the embedding problem for 2 × 2 matrices is due to Kendall and was first published by Kingman (1962); for 3 × 3 matrices the problem is fully settled in a series of papers by Carette (1995), Chen and Chen (2011), Cuthbert (1973), Israel et al (2001) and Johansen (1974); and for 4 × 4 stochastic matrices it has recently been solved in Casanellas et al (2020b)

  • In the present paper we study a refinement of the classical embedding problem, called the model embedding problem, for a class of n × n stochastic matrices coming from phylogenetic models
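
As a small illustration of the 2 × 2 case mentioned above, the following sketch uses the classical criterion that a 2 × 2 Markov matrix M = [[1 − a, a], [b, 1 − b]] is embeddable exactly when det M = 1 − a − b > 0. The function names `embed_2x2` and `expm_taylor` are illustrative assumptions and are not taken from the paper or its repository; the truncated Taylor series is only a convenience for checking the result on small matrices.

```python
import math


def embed_2x2(a, b):
    """Return a rate matrix Q with exp(Q) = [[1-a, a], [b, 1-b]], or None.

    Assumes the classical 2x2 criterion: Q exists iff det M = 1 - a - b > 0.
    """
    if not (0.0 <= a <= 1.0 and 0.0 <= b <= 1.0):
        raise ValueError("a and b must be probabilities")
    s = a + b
    det = 1.0 - s
    if det <= 0.0:
        return None  # not embeddable
    if s == 0.0:
        return [[0.0, 0.0], [0.0, 0.0]]  # M is the identity
    # M - I has eigenvalues 0 and -s, so log(M) is a scalar multiple of M - I.
    c = -math.log(det) / s
    return [[-c * a, c * a], [c * b, -c * b]]


def expm_taylor(Q, terms=60):
    """Matrix exponential by a truncated Taylor series (small matrices only)."""
    n = len(Q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes Q^k / k!
        term = [[sum(term[i][m] * Q[m][j] for m in range(n)) / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result
```

Calling `expm_taylor(embed_2x2(0.2, 0.1))` should reproduce the matrix [[0.8, 0.2], [0.1, 0.9]] up to rounding, while `embed_2x2(0.6, 0.5)` returns `None` because the determinant is negative.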

Summary

Introduction

The embedding problem for stochastic matrices, also known as Markov matrices, asks whether a given stochastic matrix M is the matrix exponential of a rate matrix Q.

Algebraic and geometric methods have been employed with great success in the study of phylogenetic models, leading to an explosion of related research and the establishment of the field of phylogenetic algebraic geometry, also known as algebraic phylogenetics; see Allman and Rhodes (2003), Baños et al (2016), Casanellas and Fernández-Sánchez (2007), Cavender and Felsenstein (1987), Gross and Long (2018), Evans and Speed (1993), Lake (1987), Pachter and Sturmfels (2004) and Sturmfels and Sullivant (2005) for a non-exhaustive list of publications. To build such a phylogenetic model, we first require a phylogenetic tree T, which is a directed acyclic graph whose vertices and edges represent the evolutionary relationships among a group of species.

A study of the set of embeddable and model-embeddable matrices corresponding to the Jukes–Cantor, Kimura-2 and Kimura-3 DNA substitution models, which are all symmetric group-based models, is undertaken in Casanellas et al (2020a) and Roca-Lacostena and Fernández-Sánchez (2018).

The code for the computations in this paper is available at https://github.com/ardiyam1/Model-Embeddability-for-Symmetric-Group-Based-Models
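
To make the role of eigenvalues concrete, here is a minimal sketch for the Kimura-3 model, viewed as a group-based model on Z2 × Z2. It assumes the four states are encoded as the integers 0 to 3 with group addition given by bitwise XOR, so that a Markov matrix in the model has entries M[g][h] = f[g XOR h] for a probability vector f. The eigenvalues of M are then the discrete Fourier transform of f, and a rate matrix with the same structure can only arise from the real logarithms of those eigenvalues. The function names are hypothetical and this is not the code from the repository above; it covers only this four-state example, not the general criterion of the paper.

```python
import math


def character(k, g):
    """Character chi_k(g) of Z2 x Z2, elements encoded as 2-bit integers."""
    return -1 if bin(k & g).count("1") % 2 else 1


def fourier(f):
    """Eigenvalues of the group-based matrix M[g][h] = f[g XOR h]."""
    return [sum(character(k, g) * f[g] for g in range(4)) for k in range(4)]


def model_log(f, tol=1e-12):
    """Return q with exp(Q) = M for Q[g][h] = q[g XOR h], or None.

    Assumes f is a probability vector (non-negative entries summing to 1).
    """
    lam = fourier(f)
    if any(x <= 0 for x in lam):
        return None  # a symmetric real Q forces strictly positive eigenvalues
    mu = [math.log(x) for x in lam]
    # Inverse Fourier transform of the log-eigenvalues gives the rates.
    q = [sum(character(k, g) * mu[k] for k in range(4)) / 4 for g in range(4)]
    if any(q[g] < -tol for g in range(1, 4)):
        return None  # some off-diagonal rate would be negative
    return q
```

For the Jukes–Cantor vector `f = [0.7, 0.1, 0.1, 0.1]` the function returns three equal positive rates and a diagonal entry that makes the rows sum to zero. For `f = [0.525, 0.225, 0.225, 0.025]` all eigenvalues are positive, yet one rate would be negative, so the function returns `None`.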

Preliminaries
G-compatible labeling functions
Model embeddability
Hachimoji DNA
Hachimoji 7-parameter model
Hachimoji 3-parameter model
Hachimoji 1-parameter model
Volume
Conclusion
A Friendly labeling functions
Findings