Abstract

The composition vector method is an alignment-free method for sequence comparison. The proposed method is based on a modified k-string method, which compares two DNA sequences using the ratios of the frequencies of all possible sub-words of length k. We propose a scheme based on modified formulas for sequence comparison that follows the principle of maximum entropy. Several formulas exist for this purpose; the one that maximizes the entropy was selected for this study. This leads to a unified approach to sequence comparison. The results were analyzed and compared with those of the existing composition vector and k-string methods by constructing phylogenetic trees. The results show that the proposed scheme performs better than the existing methods.
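
As a rough illustration of the k-string idea described above, the following Python sketch builds a vector of relative frequencies for all DNA sub-words of length k and compares two sequences. It is not the modified, entropy-maximizing formula proposed in this work, which the abstract does not state; the function names `kmer_frequencies` and `cosine_distance`, the restriction to the alphabet ACGT, and the choice of one minus the cosine similarity as the dissimilarity are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"  # assumption: plain DNA, ambiguous bases are ignored


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4**k words of length k, in a fixed order."""
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Windows containing symbols outside ALPHABET are left out of the total.
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid words of length k")
    return [counts[w] / total for w in words]


def cosine_distance(u, v):
    """Illustrative dissimilarity: 1 minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    first = kmer_frequencies("ACGTACGTGGCA", 2)
    second = kmer_frequencies("ACGTTCGTGGAA", 2)
    print(cosine_distance(first, second))
```

Computing such a distance for every pair of sequences gives a distance matrix, which a standard tree-building method such as neighbor joining can turn into a phylogenetic tree for comparison.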
