A method for identification of highly conserved elements and evolutionary analysis of superphylum Alveolata.

Lev I Rubanov, Oleg A Zverkov, Alexandr V Seliverstov, Vassily A Lyubetsky

doi:10.1186/s12859-016-1257-5

Open Access

https://doi.org/10.1186/s12859-016-1257-5

Abstract

Background: Perfectly or highly conserved DNA elements were found in vertebrates, invertebrates, and plants by various methods. However, little is known about such elements in protists. The evolutionary distance between apicomplexans can be very high, in particular, due to the positive selection pressure on them. This complicates the identification of highly conserved elements in alveolates, which is overcome by the proposed algorithm.

Results: A novel algorithm is developed to identify highly conserved DNA elements. It is based on the identification of dense subgraphs in a specially built multipartite graph (whose parts correspond to genomes). Specifically, the algorithm relies neither on genome alignments nor on pre-identified perfectly conserved elements; instead, it performs a fast search for pairs of words (in different genomes) of maximum length whose difference is below the specified edit distance. Each such pair defines an edge whose weight equals the maximum (or total) length of the words assigned to its ends. The graph composed of these edges is then compacted by merging some of its edges and vertices. The dense subgraphs are identified by a cellular automaton-like algorithm; each subgraph defines a cluster composed of similar inextensible words from different genomes. Almost all clusters are considered as predicted highly conserved elements. The algorithm is applied to the nuclear genomes of the superphylum Alveolata, and the corresponding phylogenetic tree is built and discussed.

Conclusion: We proposed an algorithm for the identification of highly conserved elements. The multitude of identified elements was used to infer the phylogeny of Alveolata.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1257-5) contains supplementary material, which is available to authorized users.
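The graph construction outlined in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses fixed-length words instead of words of maximum length, a brute-force quadratic search instead of the fast search, treats "below the specified edit distance" as `<=` a threshold, and replaces the cellular automaton-like dense-subgraph step with plain connected components. The function names (`edit_distance`, `build_edges`, `clusters`) and the parameters `k` and `max_dist` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def words(genome: str, k: int):
    """Yield (position, word) for every length-k window of a genome."""
    for pos in range(len(genome) - k + 1):
        yield pos, genome[pos:pos + k]


def build_edges(genomes: dict, k: int, max_dist: int):
    """Build edges of a multipartite graph whose parts are genomes.

    A vertex is (genome name, position). An edge joins two words from
    different genomes when their edit distance is at most max_dist.
    The weight is the word length k (a simplification of the
    "maximum or total length" weighting mentioned in the abstract).
    """
    edges = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(genomes.items(), 2):
        for pos_a, word_a in words(seq_a, k):
            for pos_b, word_b in words(seq_b, k):
                if edit_distance(word_a, word_b) <= max_dist:
                    edges.append(((name_a, pos_a), (name_b, pos_b), k))
    return edges


def clusters(edges):
    """Group vertices into connected components with union-find.

    Stand-in for the dense-subgraph step; connected components are a
    much weaker criterion than density.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v, _weight in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    groups = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


if __name__ == "__main__":
    toy = {"g1": "ACGTACGT", "g2": "ACGTACGA", "g3": "TTTTTTTT"}
    toy_edges = build_edges(toy, k=8, max_dist=1)
    print(toy_edges)
    print(clusters(toy_edges))
```

Because only words from different genomes are compared, every edge crosses between parts, which is what makes the graph multipartite. The brute-force comparison is only practical for toy inputs; genome-scale data would need the kind of fast search the abstract refers to.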

Highlights

Perfectly or highly conserved DNA elements were found in vertebrates, invertebrates, and plants by various methods
The longest (6.99 Mbp) chromosome of Neospora caninum was collated in turn with three well-assembled full genomes: Babesia microti of 4 chromosomes (6.39 Mbp in total), Cryptosporidium parvum of 8 chromosomes (9.1 Mbp), and Plasmodium falciparum of 16
We presented a novel algorithm to identify highly conserved DNA elements; it was applied to the superphylum Alveolata

Summary

Introduction

Perfectly or highly conserved DNA elements were found in vertebrates, invertebrates, and plants by various methods. The evolutionary distance between apicomplexans can be very high, in particular, due to the positive selection pressure on them. This complicates the identification of highly conserved elements in alveolates, which is overcome by the proposed algorithm.

Ultraconserved elements (UCEs) are perfectly conserved regions of genomes shared among evolutionarily distant taxa. It is assumed that these regions are identical in closely related species and have minor differences in relatively distant ones, which substantially limits the phylogenetic distances. Hundreds of conserved noncoding sequences were detected in four dicotyledonous plant species: Arabidopsis thaliana, Carica papaya, Populus trichocarpa, and Vitis vinifera [3].

Journal: BMC bioinformatics	Publication Date: Sep 20, 2016
Citations: 64	License type: CC BY 4.0
