Abstract

In this paper, we combine two strategies to improve the final model representing a set of independent samples. We consider a set of independent samples drawn from Markov processes of finite order over a finite alphabet. Under the assumption that there exists a law that prevails in at least 50% of the samples of the collection, we identify the samples governed by the predominant law (Fernández et al. (2019) Math Methods Appl Sci. https://doi.org/10.1002/mma.5705). The approach is based on a local metric between samples, which tends to zero when comparing samples of identical law and tends to infinity when comparing samples with different laws. The local metric also allows us to define a criterion that takes arbitrarily large values when the assumption about the existence of a predominant law does not hold. By means of this procedure, we select the samples that will be used to establish a minimal Markov model for the whole set of samples (García and González-López (2017) Entropy 19(4):160). These procedures were applied to nine complete genomic sequences of Dengue virus type 3 (DENV3) from the outbreak that occurred in Henan, China, in 2013. The final model of the Henan set was built using the most representative sequences. It can be described by eight basic units (parts). The states that compose each of these basic units share the same transition probability to any element of the genomic alphabet A = {a, c, g, t}. We note that six of these units show a predilection for choosing the element a as the next element, while the other two units prefer the element g as the next element.
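
To make the notions of a state's transition probabilities and its preferred next element concrete, here is a minimal Python sketch. It only estimates empirical transition probabilities of a fixed-order Markov chain from a single sequence over A = {a, c, g, t}; it does not implement the local metric or the minimal Markov model selection of the cited papers. The function names, the toy sequence, and the order 2 are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

ALPHABET = "acgt"  # genomic alphabet A = {a, c, g, t}


def transition_probabilities(sequence, order):
    """Empirical P(next symbol | previous `order` symbols) for one sample.

    Hypothetical helper: plain relative frequencies, no smoothing.
    """
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        state = sequence[i - order:i]      # the `order` symbols before position i
        counts[state][sequence[i]] += 1    # count the observed next symbol
    probs = {}
    for state, ctr in counts.items():
        total = sum(ctr.values())
        probs[state] = {x: ctr[x] / total for x in ALPHABET}
    return probs


def preferred_symbol(dist):
    """Symbol with the largest estimated transition probability."""
    return max(ALPHABET, key=lambda x: dist[x])


# Toy sequence (made up for illustration, not DENV3 data).
toy = "acgtacgaacga"
probs = transition_probabilities(toy, order=2)
for state in sorted(probs):
    print(state, probs[state], "prefers", preferred_symbol(probs[state]))
```

In a model of the kind described above, states whose estimated distributions coincide would be merged into a single unit (part); the criterion used to decide that merging is the subject of the cited work and is not reproduced here.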
