Abstract

Background: The analysis of DNA nucleotide sequence similarity among different species is crucial in identifying their functional, structural or evolutionary relationships. The number of bioinformatics tools designed to perform similarity analysis of nucleotide sequences has been growing rapidly. According to the current literature, alignment-free methods have not yet been applied to repetitive nucleotide sequences of different lengths.

Objective: To develop a new algorithm for determining sequence characteristics and similarity based on statistically significant repetitive elements of different lengths located in the analyzed sequences.

Methods: This paper presents the Repeats-Position/Frequency method (R-P/F method) for determining nucleotide sequence similarity, which takes into consideration the statistically significant repetitive parts of the analyzed sequences. It is based on information theory and on the observation that repeated sequences are not expected to occur at the same positions and with the same frequency in a random sequence of the same length. Nucleotide sequences are represented in an rn-dimensional vector space, and their hierarchy is constructed by applying a hierarchical clustering algorithm.

Results: The R-P/F method has been validated on multiple data sets of nucleotide sequences and compared with results obtained from the alignment-based algorithms BLAST and Clustal Omega and from multiple well-established alignment-free dissimilarity measures. The presented method provides results comparable with those of other commonly used methods addressing the same problem, while offering a novel perspective by basing the calculations on the repetitive parts of the sequences.

Conclusion: The presented novel algorithm for calculating a sequence similarity measure is effective in discovering relationships among sequences and is a powerful, complementary addition to existing sequence similarity methods.
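The general idea described under Methods (repeat positions and frequencies, a vector representation, then hierarchical clustering) can be illustrated with a minimal sketch. This is not the R-P/F method itself: the abstract does not specify the statistical significance test, the vector construction, or the clustering linkage. The sketch assumes that a word is "repetitive" if it simply occurs more than once, that each repeated word contributes its relative frequency and mean normalized position, and that average linkage is used. The function names `repeat_profile`, `profile_distance` and `average_linkage`, the parameters `k_min` and `k_max`, and the toy sequences are all hypothetical.

```python
from itertools import combinations
import math


def repeat_profile(seq, k_min=2, k_max=4):
    """Map every word of length k_min..k_max that occurs more than once
    to (relative frequency, mean position normalized by sequence length).
    Assumption: 'occurs more than once' stands in for the paper's
    statistical significance criterion, which the abstract does not give."""
    n = len(seq)
    profile = {}
    for k in range(k_min, k_max + 1):
        positions = {}
        for i in range(n - k + 1):
            positions.setdefault(seq[i:i + k], []).append(i)
        total = n - k + 1
        for word, pos in positions.items():
            if len(pos) > 1:  # keep repeated words only
                profile[word] = (len(pos) / total, sum(pos) / len(pos) / n)
    return profile


def profile_distance(p, q):
    """Euclidean distance over the union of repeated words.
    A word absent from one profile is treated as (0, 0) there."""
    zero = (0.0, 0.0)
    return math.sqrt(sum(
        (p.get(w, zero)[0] - q.get(w, zero)[0]) ** 2
        + (p.get(w, zero)[1] - q.get(w, zero)[1]) ** 2
        for w in set(p) | set(q)
    ))


def average_linkage(names, dist):
    """Agglomerative clustering with average linkage (an assumed choice).
    Returns the hierarchy as nested tuples of sequence names."""
    clusters = [(name, [name]) for name in names]  # (subtree, leaf members)

    def linkage(pair):
        left, right = clusters[pair[0]][1], clusters[pair[1]][1]
        total = sum(dist[frozenset((x, y))] for x in left for y in right)
        return total / (len(left) * len(right))

    while len(clusters) > 1:
        a, b = min(combinations(range(len(clusters)), 2), key=linkage)
        merged = ((clusters[a][0], clusters[b][0]),
                  clusters[a][1] + clusters[b][1])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]


# Toy input (made up for illustration, not data from the paper).
seqs = {
    "a": "ACGTACGTACGTACGT",
    "b": "ACGTACGTACGAACGT",
    "c": "TTTTGGGGTTTTGGGG",
}
profiles = {name: repeat_profile(s) for name, s in seqs.items()}
dist = {
    frozenset((x, y)): profile_distance(profiles[x], profiles[y])
    for x, y in combinations(seqs, 2)
}
tree = average_linkage(list(seqs), dist)
print(tree)
```

In this toy input, sequences `a` and `b` share all of their repeated words and differ only slightly in frequency and position, so they are merged first; `c` shares no repeated words with either and joins last. A real implementation would replace the "occurs more than once" filter with a test against the expectation for a random sequence of the same length, as the abstract describes.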
