PhyloPattern: regular expressions to identify complex patterns in phylogenetic trees

Philippe Gouret, Pierre Pontarotti, Julie D Thompson

doi:10.1186/1471-2105-10-298

Abstract

Background: To effectively apply evolutionary concepts in genome-scale studies, large numbers of phylogenetic trees have to be automatically analysed, at a level approaching human expertise. Complex architectures must be recognized within the trees, so that associated information can be extracted.

Results: Here, we present a new software library, PhyloPattern, for automating tree manipulations and analysis. PhyloPattern includes three main modules, which address essential tasks in high-throughput phylogenetic tree analysis: node annotation, pattern matching, and tree comparison. PhyloPattern thus allows the programmer to focus on: i) the use of predefined or user-defined annotation functions to perform immediate or deferred evaluation of node properties, ii) the search for user-defined patterns in large phylogenetic trees, and iii) the pairwise comparison of trees by dynamically generating patterns from one tree and applying them to the other.

Conclusion: PhyloPattern greatly simplifies and accelerates the work of the computer scientist in the evolutionary biology field. The library has been used to automatically identify phylogenetic evidence for domain shuffling or gene loss events in the evolutionary histories of protein sequences. However, any workflow that relies on phylogenetic tree analysis could be automated with PhyloPattern.

Highlights

To effectively apply evolutionary concepts in genome-scale studies, large numbers of phylogenetic trees have to be automatically analysed, at a level approaching human expertise.
Any workflow that relies on phylogenetic tree analysis could be automated with PhyloPattern.
The strategy adopted in developing PhyloPattern was first to understand how a biologist reads and uses a phylogenetic tree; from this, three main functionalities were deduced that are essential for most tree analyses: tree annotation, pattern matching and tree comparison.

Summary

Results

The strategy we adopted in developing PhyloPattern was first to understand how a biologist reads and uses a phylogenetic tree. From this, we deduced three main functionalities that are essential for most tree analyses: tree annotation, pattern matching and tree comparison.

An evolutionary strategy might be to construct phylogenetic trees for the two protein domains (based on a multiple sequence alignment) and to perform the following steps with the PhyloPattern API:

Step 1: annotate the leaves of the trees with the domain architectures of the associated proteins (using a protein domain database), and annotate the internal nodes by inferring their domain architectures from the leaf domain architectures (using, for example, the Dollo parsimony algorithm [18] or Maximum Likelihood methods [19]).
Step 2: define a pattern, with constraints based mainly on the domain architecture tag, that tries to find the parent node of a shuffling event, and apply it to each tree (see pattern schema at the top of Figure 3).
Step 3: if such a node is found, annotate each tree by adding event tags to the derived nodes found under the event's parent node.
Step 4: apply two patterns to each tree: the first to extract a common leaf (same name) from each "event marked" subtree, and the second to extract an "ancestral" leaf (with the "parent" domain architecture).

The subtrees with sequences [ENSP00000312158, ENSPTR00000056995, ENSMMUP00000028928] match and the other subtrees do not.
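The four steps above can be illustrated with a minimal Python sketch. This is not the PhyloPattern API: the `Node` class, the function names, the toy tree and its domain labels `A` and `B` are all hypothetical, chosen only for illustration. Internal nodes are inferred with a naive rule (keep the domains shared by all children) standing in for Dollo parsimony or Maximum Likelihood, and the "pattern" is reduced to a single constraint: a child whose domain architecture is a strict superset of its parent's.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; not a PhyloPattern data structure."""
    name: str = ""
    children: list = field(default_factory=list)
    domains: frozenset = frozenset()          # domain architecture tag
    tags: set = field(default_factory=set)    # event tags added in step 3


def walk(node):
    """Yield a node and all of its descendants in preorder."""
    yield node
    for child in node.children:
        yield from walk(child)


def leaves(node):
    """Return the leaves below a node, in left-to-right order."""
    return [n for n in walk(node) if not n.children]


def annotate(node, leaf_domains):
    """Step 1: tag leaves from a lookup table, then infer internal nodes.

    Assumption: an internal node keeps only the domains shared by all of
    its children. This is a naive stand-in, not Dollo parsimony or ML.
    """
    if not node.children:
        node.domains = frozenset(leaf_domains[node.name])
    else:
        for child in node.children:
            annotate(child, leaf_domains)
        node.domains = frozenset.intersection(*(c.domains for c in node.children))
    return node


def find_events(root):
    """Step 2: return (parent, derived) pairs where the child gained domains."""
    return [(p, c) for p in walk(root) for c in p.children if p.domains < c.domains]


def mark_event(derived, tag="shuffling"):
    """Step 3: add an event tag to the derived node and everything below it."""
    for n in walk(derived):
        n.tags.add(tag)


def extract(parent, tag="shuffling"):
    """Step 4: list event-marked leaves and leaves with the parent architecture."""
    marked = [l.name for l in leaves(parent) if tag in l.tags]
    ancestral = [l.name for l in leaves(parent)
                 if tag not in l.tags and l.domains == parent.domains]
    return marked, ancestral


# Toy tree with made-up leaf names and domain architectures.
tree = Node("root", [
    Node("out1"),
    Node("inner", [
        Node("anc1"),
        Node("clade", [Node("s1"), Node("s2")]),
    ]),
])
leaf_domains = {"out1": {"A"}, "anc1": {"A"}, "s1": {"A", "B"}, "s2": {"A", "B"}}

annotate(tree, leaf_domains)
events = find_events(tree)
for parent, derived in events:
    mark_event(derived)
marked, ancestral = extract(events[0][0])
print(marked, ancestral)
```

In this toy tree the only match is the pair (`inner`, `clade`), because `clade` carries the extra domain `B`. A real analysis would replace the intersection rule with a proper ancestral-state reconstruction and would express the constraint as a declarative pattern rather than a hard-coded comparison.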

Conclusion

Background

Dobzhansky T

16. McCarthy J

19. Felsenstein J


Journal: BMC Bioinformatics	Publication Date: Sep 19, 2009
Citations: 81	License type: CC BY 2.0
