Abstract

Classification assigns labels to subjects based on the characteristics of their data. It typically draws on information previously learned from preexisting data of the same type or distribution to make an informed decision for each subject. The method presented here, the Characteristic Attribute Organization System (CAOS), takes a character-based approach to molecular sequence classification. Given a set of aligned sequences (either nucleotide or amino acid) and a maximum parsimony tree, CAOS generates classification rules for the sequences based on the tree structure and provides more interpretable results than other classification or sequence analysis protocols. The code is available at https://github.com/JuliaHealth/CAOS.jl/.
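
To illustrate the general idea of character-based classification rules, here is a minimal Python sketch for a single split in a tree. It is not the CAOS.jl API or the authors' implementation: the function names `diagnostic_positions` and `classify`, the group labels, and the toy sequences are all hypothetical, and the sketch assumes that a rule is simply an alignment column where every sequence in one group shares a state that never occurs in the other group.

```python
from typing import Dict, List, Tuple

Rule = Tuple[int, str]  # (alignment column, character state)


def diagnostic_positions(group_a: List[str], group_b: List[str]) -> List[Rule]:
    """Find columns where group_a is fixed for a state absent from group_b.

    Hypothetical helper for illustration only; not part of CAOS.jl.
    """
    length = len(group_a[0])
    if any(len(seq) != length for seq in group_a + group_b):
        raise ValueError("sequences must be aligned to the same length")

    rules: List[Rule] = []
    for pos in range(length):
        states_a = {seq[pos] for seq in group_a}
        states_b = {seq[pos] for seq in group_b}
        # A rule needs one shared state in group_a that group_b never shows.
        if len(states_a) == 1 and not (states_a & states_b):
            rules.append((pos, next(iter(states_a))))
    return rules


def classify(seq: str, rules_by_group: Dict[str, List[Rule]]) -> str:
    """Assign seq to the group whose rules it matches most often."""
    scores = {
        name: sum(seq[pos] == state for pos, state in rules)
        for name, rules in rules_by_group.items()
    }
    return max(scores, key=scores.get)


# Toy aligned sequences for the two sides of one tree split (invented data).
clade_a = ["ACGTA", "ACGTC"]
clade_b = ["ATGAA", "ATGAC"]

rules_by_group = {
    "clade_a": diagnostic_positions(clade_a, clade_b),
    "clade_b": diagnostic_positions(clade_b, clade_a),
}
assigned = classify("ACGTG", rules_by_group)
```

Because each rule names a specific column and state, an assignment can be traced back to the exact positions that supported it, which is the sense in which rule-based results are easy to interpret. A full method would repeat this kind of comparison at each node of the tree rather than at a single split.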

