Abstract

Multiple sequence alignment (MSA) is a preliminary task for estimating phylogenies. It is used for homology inference among the sequences of a set of species. Generally, the MSA task is handled as a single-objective optimization process. The alignments computed under one criterion may be different from the alignments generated by other criteria, inferring discordant homologies and thus leading to different hypothesized evolutionary histories relating the sequences. The multiobjective (MO) formulation of MSA has recently been advocated by several researchers, to address this issue. An MO approach independently optimizes multiple (often conflicting) objective functions at the same time and outputs a set of competitive alignments. However, no conceptual or experimental rational from a real-world application perspective has been reported so far for any MO formulation of MSA. This article work investigates the impact of MO formulation in the context of an important scientific problem, namely, phylogeny estimation. Employing popular evolutionary MO algorithms, we show that: 1) trees inferred based on alignments produced by the existing MSA methods used in practice are substantially worse in quality than the trees inferred based on the alignment's output by an MO algorithm and 2) even high-quality alignments (according to popular measures available in the literature) may fail to achieve acceptable accuracy in generating phylogenetic trees. Thus, we essentially ask the following natural question: "can a phylogeny-aware (i.e., application-aware) metric guide in selecting appropriate MO formulations to ensure better phylogeny estimation?" Here, we report a carefully designed extensive experimental study that positively answers this question.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call