Abstract

Multiple Sequence Alignment (MSA) is a well-known problem in computational biology. To evaluate the quality of a solution, many different scoring functions have been introduced, the most widely used being the Sum-of-Pairs score (SP-score). Computing the best MSA under the SP-score is known to be NP-hard. In this paper, we introduce a variant of the Column score (defined in Thompson et al. 1999), which we refer to as the Selective Column score: given a symbol a∈Σ, the score of the i-th column is one if and only if all symbols in that column are a, and zero otherwise. The a-column score of an alignment is then the number of columns consisting only of the character a. We show that finding the optimal MSA under the Selective Column score is NP-hard for all alphabets of size |Σ|≥2, and that the associated maximization problem is poly-APX-hard. We also give an approximation algorithm that almost matches the inapproximability bound.
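To make the definition concrete, here is a minimal sketch that evaluates the a-column score of an alignment that is already given. It is an illustration only, not the paper's algorithm: the function name `selective_column_score`, the representation of an alignment as equal-length strings, and the use of "-" as the gap symbol are all assumptions made for this example.

```python
def selective_column_score(alignment, a):
    """Count the columns of `alignment` whose symbols are all equal to `a`.

    `alignment` is assumed to be a list of equal-length strings (one row per
    sequence, with "-" as a hypothetical gap symbol); `a` is one alphabet symbol.
    """
    if not alignment:
        return 0
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows of an alignment must have the same length")
    # zip(*alignment) iterates over the columns of the alignment.
    return sum(1 for column in zip(*alignment) if all(s == a for s in column))


# Toy alignment of three sequences over the alphabet {A, C}.
alignment = [
    "AC-A",
    "A-CA",
    "ACCA",
]
print(selective_column_score(alignment, "A"))  # columns 1 and 4 are all "A": prints 2
print(selective_column_score(alignment, "C"))  # no column is all "C": prints 0
```

Scoring a fixed alignment in this way takes time linear in its size; the hardness results stated above concern the search for an alignment that maximizes this score.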
