Abstract

Multiple sequence alignment (MSA) methods identify homologous regions in DNA and protein sequences. Among MSA algorithms, the center star method (CSM) is useful when processing a large number of sequences. However, because CSM aligns every sequence against a single center sequence, regions that are conserved only among some of the non-center sequences tend to be obscured in the resulting alignment. As a result, when the input set contains subsets of similar sequences, accuracy drops significantly. To address this problem, this research introduces an MSA algorithm based on a solution proposed by D. Gusfield in 1991. The sequences are first grouped into subsets, each subset is aligned separately using CSM, and the resulting alignments are then merged by a progressive alignment procedure. The detailed solution is presented in the following section.
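
As a rough illustration of the center selection step that CSM relies on, the following Python sketch picks the sequence with the smallest total pairwise distance to all others. It is a minimal sketch under stated assumptions, not the implementation used in this research: plain edit distance stands in for whatever pairwise score the method actually uses, and the names edit_distance and choose_center are hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit costs).

    Assumption: unit-cost edit distance is used here only as a simple
    stand-in for a pairwise alignment score.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def choose_center(seqs):
    """Return the index of the sequence with the smallest summed
    distance to all other sequences (hypothetical helper).

    Ties are resolved in favor of the lowest index.
    """
    totals = [0] * len(seqs)
    for i, j in combinations(range(len(seqs)), 2):
        d = edit_distance(seqs[i], seqs[j])
        totals[i] += d
        totals[j] += d
    return min(range(len(seqs)), key=totals.__getitem__)


if __name__ == "__main__":
    sequences = ["ACGT", "ACGG", "ACGT", "TTTT"]
    print(choose_center(sequences))
```

In a grouped variant such as the one described above, a selection like this would be applied within each subset of similar sequences rather than once over the whole input set; how the subsets are formed and merged is left to the following section.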
