Abstract
Multiple sequence alignment (MSA) methods infer homologous regions within DNA and protein sequences. Among MSA algorithms, the center star method (CSM) is useful when processing a large number of sequences. In this method, however, partially conserved regions shared among the non-center sequences tend to be hidden in the alignment. The existence of subsets of similar sequences in the input set therefore leads to a significant loss of accuracy. To address this problem, this research introduces an MSA algorithm based on a solution proposed by D. Gusfield in 1991. The sequences are first grouped into subsets, each subset is aligned separately using CSM, and the resulting alignments are finally merged by applying a progressive alignment procedure. The detailed solution is presented in the following section.
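As a rough illustration of the step CSM starts from, the sketch below picks a center sequence as the one with the smallest total pairwise distance to all the others. It is only a minimal sketch under stated assumptions: the unit-cost edit distance stands in for whatever scoring scheme is actually used, and the names `edit_distance` and `choose_center` are hypothetical helpers introduced here, not taken from this work.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (an assumed scoring scheme)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match (0) or substitute (1)
            ))
        prev = curr
    return prev[-1]


def choose_center(seqs: List[str]) -> int:
    """Return the index of the sequence with the smallest summed distance
    to every sequence in the set (ties resolve to the lowest index)."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return totals.index(min(totals))
```

For example, `choose_center(["ACGT", "ACGA", "ACGT", "TCGT"])` selects index 0, since `"ACGT"` differs from each of the other sequences by at most one substitution. In the grouped variant described above, a selection of this kind would presumably be made once per subset rather than once for the whole input set.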