Abstract

The composition vector (CV) method is an alignment-free method for sequence comparison. The proposed method builds on a modified k-string method, which compares two DNA sequences using the ratios of the frequencies of all possible sub-words of length k. We propose a scheme based on modified formulas for sequence comparison, derived from the principle of maximum entropy. Several candidate formulas exist for this purpose; the one that maximizes the entropy was selected for this study, which leads to a unified approach to sequence comparison. The results were analyzed and compared with those of the existing composition vector and k-string methods by constructing phylogenetic trees. They show that the proposed scheme performs better than the existing methods.
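To make the k-string idea concrete, the following is a minimal Python sketch of counting the frequencies of all length-k sub-words in a DNA sequence and comparing two sequences through those frequency vectors. It is an illustration only: the function names `kstring_frequencies` and `cosine_distance` are hypothetical, the cosine-based distance is an assumed stand-in, and the modified maximum-entropy formulas of the proposed scheme are not given in the abstract and are not reproduced here.

```python
from collections import Counter
from itertools import product
import math


def kstring_frequencies(seq, k):
    """Relative frequency of every length-k sub-word over the alphabet ACGT.

    Windows containing other symbols (e.g. N) are ignored. This is an
    assumption of the sketch, not something stated in the abstract.
    """
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    return {w: (counts[w] / total if total else 0.0) for w in words}


def cosine_distance(f, g):
    """1 - cosine similarity of two frequency vectors (assumed distance)."""
    dot = sum(f[w] * g[w] for w in f)
    norm_f = math.sqrt(sum(v * v for v in f.values()))
    norm_g = math.sqrt(sum(v * v for v in g.values()))
    if norm_f == 0.0 or norm_g == 0.0:
        return 1.0  # no usable k-strings in at least one sequence
    return 1.0 - dot / (norm_f * norm_g)


if __name__ == "__main__":
    a = kstring_frequencies("ACGTACGTTGCA", 2)
    b = kstring_frequencies("ACGTTGCAACGT", 2)
    print(cosine_distance(a, b))
```

A matrix of such pairwise distances could then be passed to a standard tree-building method (for example neighbor joining) to draw a phylogenetic tree, which is how the abstract describes the comparison between methods.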
