Abstract

A string or sequence is a linear array of symbols that come from an alphabet. Due to unknown substitutions, insertions, and deletions of symbols, a sequence cannot be treated like a vector or a tuple of a fixed number of variables. The synthesis of an ensemble of sequences is a sequence of random elements that specify the probabilities of occurrence of the different symbols at the corresponding sites of the sequences. The synthesis is determined by a hierarchical sequence synthesis procedure (HSSP), which returns not only the taxonomic hierarchy of the whole ensemble of sequences but also the alignment and the synthesis of a group (a subset of the ensemble) of the sequences at each level of the hierarchy. The HSSP does not require the ensemble of sequences to be presented as a tabulated array of data, nor does it require hierarchical information about the data or the assumption of a stochastic process. The authors present the concept of sequence synthesis and the applicability of the HSSP as both a supervised and an unsupervised classification procedure.
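The notion of a synthesis as per-site symbol probabilities can be illustrated with a minimal sketch. It assumes the group of sequences has already been aligned to a common length, with `-` as a hypothetical gap symbol for insertions and deletions. It is not the HSSP itself, which also determines the alignment and the hierarchy. The names `synthesize`, `GAP`, and the sample sequences are illustrative assumptions and do not come from the paper.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol marking an insertion/deletion in the alignment


def synthesize(aligned):
    """Return one dict per site mapping each symbol to its relative frequency.

    `aligned` is a list of equal-length strings (an already aligned group).
    This is only a frequency estimate, not the authors' procedure.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    n = len(aligned)
    synthesis = []
    for site in range(length):
        # Count how often each symbol (including the gap) occurs at this site.
        counts = Counter(seq[site] for seq in aligned)
        synthesis.append({sym: c / n for sym, c in counts.items()})
    return synthesis


# Hypothetical aligned group of four sequences over the alphabet {A, C, G, T}.
group = ["ACGT", "A" + GAP + "GT", "ACGA", "TCGT"]
profile = synthesize(group)
for site, dist in enumerate(profile):
    print(site, dist)
```

Each entry of `profile` plays the role of one random element of the synthesis: a distribution over the symbols observed at that site, with the gap symbol counted as one more outcome.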
