Abstract

A phylogeny, usually represented by an unrooted binary tree, shows the evolutionary history of a set of species. Phylogenies derived from molecular data may prove crucial in answering some fundamental open questions in molecular evolution, and phylogeny reconstruction is a key research problem in computational biology. Biologists now study genomic sequences such as DNA and protein (amino acid) sequences. In the past few years, they have also embraced a new kind of data: gene order. We study the phylogeny reconstruction problem under maximum parsimony for both gene-order and gene-sequence data. Unlike other researchers, who work on heuristic algorithms because the number of possible tree topologies grows exponentially with the number of species, we focus on exact algorithms that give the optimal solution. We adopted new data structures and new algorithms to develop GRAPPA, a gene-order-based phylogeny reconstruction software package. GRAPPA is much faster than its widely used peer, BPAnalysis. Techniques employed in GRAPPA include a linear-time algorithm to compute the inversion distance between species represented by signed gene-order data and a set of efficient tree generators to enumerate trees. For sequence data, we use a branch-and-bound approach to prune the search space and find the optimal trees for a set of species. We put forward techniques such as rearranging sites to exploit the properties of parsimony-uninformative sites and compute the tree length faster. We also designed a fast new optimization algorithm that takes constant time to compute the length of new trees compatible with a partial tree. This algorithm can be used not only in branch-and-bound search but also in heuristic search techniques such as SPR (subtree pruning and regrafting) and TBR (tree bisection and reconnection). To take advantage of the computing power of modern computers, we use a simple data structure and a simple but efficient mutual-exclusion lock mechanism to access the shared data in a high-performance implementation we designed for symmetric multiprocessors (SMPs).
We demonstrate the power of these techniques through an extensive performance study based upon simulated evolution under a wide range of model conditions.
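
As a rough illustration of two ideas mentioned above, the sketch below counts unrooted binary topologies and computes a maximum-parsimony tree length while treating parsimony-uninformative sites separately. It is not taken from GRAPPA or from the authors' sequence-based code. It uses the textbook Fitch algorithm, which is an assumption about the scoring method, and all function names and the nested-tuple tree encoding are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple, Union

# Hypothetical encoding: a binary tree as nested 2-tuples whose leaves are
# taxon indices. An unrooted tree is rooted on an arbitrary edge; the Fitch
# length does not depend on where the root is placed.
Tree = Union[int, Tuple["Tree", "Tree"]]


def num_unrooted_binary_trees(n: int) -> int:
    """Number of unrooted binary topologies on n >= 3 labeled leaves: (2n-5)!!."""
    count = 1
    for k in range(3, 2 * n - 4, 2):  # 3 * 5 * ... * (2n - 5)
        count *= k
    return count


def is_parsimony_informative(column: str) -> bool:
    """A site is informative if at least two states each occur in >= 2 taxa."""
    counts = Counter(column)
    return sum(1 for c in counts.values() if c >= 2) >= 2


def fitch_site_length(tree: Tree, column: str) -> int:
    """Minimum number of state changes at one site (Fitch's algorithm)."""
    def visit(node: Tree):
        if isinstance(node, int):
            return {column[node]}, 0
        left_states, left_cost = visit(node[0])
        right_states, right_cost = visit(node[1])
        common = left_states & right_states
        if common:
            return common, left_cost + right_cost
        # Disjoint state sets force one extra change on this subtree.
        return left_states | right_states, left_cost + right_cost + 1

    return visit(tree)[1]


def tree_length(tree: Tree, seqs: Sequence[str]) -> int:
    """Parsimony length of aligned sequences on a tree.

    Uninformative sites cost (number of distinct states - 1) on every
    topology, so they are summed once and only informative sites are
    scored with Fitch. In a search over many trees the offset would be
    computed a single time; it is recomputed here to keep the sketch short.
    """
    columns = ["".join(s[i] for s in seqs) for i in range(len(seqs[0]))]
    offset = sum(len(set(c)) - 1 for c in columns
                 if not is_parsimony_informative(c))
    informative = [c for c in columns if is_parsimony_informative(c)]
    return offset + sum(fitch_site_length(tree, c) for c in informative)


if __name__ == "__main__":
    seqs = ["AAC", "AAG", "GAT", "GCT"]  # toy alignment, 4 taxa, 3 sites
    print(num_unrooted_binary_trees(10))
    print(tree_length(((0, 1), (2, 3)), seqs))
    print(tree_length(((0, 2), (1, 3)), seqs))
```

In this toy alignment only the first site is informative, so the two topologies differ only in that site's cost, while the other two sites contribute the same fixed amount to both trees. The topology count shows why exhaustive enumeration needs aggressive pruning even for modest numbers of species.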
