Abstract

Word-based or ‘alignment-free’ methods for phylogeny inference have become popular in recent years. These methods are much faster than traditional, alignment-based approaches, but they are generally less accurate. Most alignment-free methods calculate ‘pairwise’ distances between nucleic-acid or protein sequences; these distance values can then be used as input for distance-based tree-reconstruction methods such as neighbor-joining. In this paper, we propose the first word-based phylogeny approach that is based on ‘multiple’ sequence comparison and ‘maximum likelihood’. Our algorithm first samples small, gap-free alignments involving four taxa each. For each of these alignments, it then calculates a quartet tree. Finally, the program ‘Quartet MaxCut’ is used to infer a supertree for the full set of input taxa from the calculated quartet trees. Experimental results show that trees produced with our approach are of high quality.
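The quartet step described in the abstract can be illustrated with a minimal sketch. Note that this is not the method of the paper: the paper infers each quartet tree by maximum likelihood, whereas the sketch below swaps in a much simpler parsimony-style site count to keep the example short. The function name `quartet_topology`, the input layout (a dictionary of four equal-length, gap-free sequences) and the output string format are assumptions made for illustration only.

```python
from collections import Counter


def quartet_topology(block):
    """Pick a quartet topology for a gap-free alignment of four taxa.

    `block` maps four taxon names to sequences of equal length.
    This is a parsimony-style stand-in for the maximum-likelihood
    step described in the abstract, not the paper's actual method.
    Returns a string such as "A,B|C,D", or None if no single
    topology has the strictly highest support.
    """
    names = sorted(block)
    if len(names) != 4:
        raise ValueError("exactly four taxa are required")
    a, b, c, d = (block[name] for name in names)
    if len({len(a), len(b), len(c), len(d)}) != 1:
        raise ValueError("sequences must have equal length")

    # Count informative columns: exactly two states, each seen twice.
    support = Counter()
    for w, x, y, z in zip(a, b, c, d):
        if w == x and y == z and w != y:
            support[0] += 1  # pattern xxyy supports ab|cd
        elif w == y and x == z and w != x:
            support[1] += 1  # pattern xyxy supports ac|bd
        elif w == z and x == y and w != x:
            support[2] += 1  # pattern xyyx supports ad|bc

    ranked = support.most_common()
    if not ranked:
        return None  # no informative column at all
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the best two topologies

    n = names
    splits = {
        0: f"{n[0]},{n[1]}|{n[2]},{n[3]}",
        1: f"{n[0]},{n[2]}|{n[1]},{n[3]}",
        2: f"{n[0]},{n[3]}|{n[1]},{n[2]}",
    }
    return splits[ranked[0][0]]
```

In a pipeline of the kind the abstract outlines, one such call would be made per sampled four-taxon block, and the resulting quartet topologies would then be collected and passed to a supertree program such as Quartet MaxCut.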

Highlights

  • Sequence-based phylogeny reconstruction is a fundamental task in computational biology

  • Experimental results show that trees produced with our approach are of high quality

  • Standard software tools for phylogeny reconstruction are relatively slow, because they rely on multiple sequence alignments and on time-consuming probabilistic calculations

Introduction

Sequence-based phylogeny reconstruction is a fundamental task in computational biology. ‘Character-based’ methods such as ‘Maximum Parsimony’ [1,2] or ‘Maximum Likelihood’ [3] infer trees based on evolutionary substitution events that may have happened since the species evolved from their last common ancestor. These methods are generally considered to be accurate as long as the underlying alignment is of high quality and as long as suitable substitution models are used. For the task of multiple alignment, however, no exact polynomial-time algorithm is known, and even heuristic approaches are relatively time consuming [4]. Moreover, the problem of finding an optimal tree under character-based criteria is known to be NP-hard [5,6].
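To give a feeling for why exhaustive tree search quickly becomes impractical, the following sketch counts the unrooted binary tree topologies on n labeled taxa, using the standard formula (2n-5)!! for n ≥ 3. This only illustrates the size of the search space; it is not a proof of the hardness results cited above, and the function name `num_unrooted_trees` is chosen here for illustration.

```python
def num_unrooted_trees(n):
    """Number of unrooted binary tree topologies on n labeled taxa.

    Uses the double factorial (2n-5)!! = 3 * 5 * ... * (2n-5),
    which is valid for n >= 3.
    """
    if n < 3:
        raise ValueError("formula requires at least three taxa")
    count = 1
    for k in range(3, 2 * n - 5 + 1, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n))
```

For four taxa there are only three possible topologies, which is why quartet trees are cheap to evaluate individually, while the count already exceeds two million for ten taxa.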

