Abstract

Deciding whether a substitution matrix is embeddable (i.e. the corresponding Markov process has a continuous-time realization) is an open problem even for [Formula: see text] matrices. We study the embedding problem and rate identifiability for the K80 model of nucleotide substitution. For these [Formula: see text] matrices, we fully characterize the set of embeddable K80 Markov matrices and the set of embeddable matrices for which rates are identifiable. In particular, we describe an open subset of embeddable matrices with non-identifiable rates. This set contains matrices with positive eigenvalues and also diagonal largest in column matrices, which might lead to consequences in parameter estimation in phylogenetics. Finally, we compute the relative volumes of embeddable K80 matrices and of embeddable matrices with identifiable rates. This study concludes the embedding problem for the more general model K81 and its submodels, which had been initiated by the last two authors in a separate work.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call