Abstract

Whole genome gene order evolution in higher eukaryotes was initially considered a random process. Gene order conservation, or conserved synteny, was seen as a feature of common descent and did not imply the existence of functional constraints. This view had to be revised in the light of results from sequencing dozens of vertebrate genomes. It became apparent that other factors exist that constrain gene order in some genomic regions over long evolutionary time periods. Outside of these regions, genomes diverge more rapidly in terms of gene content and order.

We have developed CYNTENATOR, a progressive gene order alignment software, to identify genomic regions of conserved synteny over a large set of diverging species. CYNTENATOR does not depend on nucleotide-level alignments or a priori homology assignment. Our software implements an improved scoring function that utilizes the underlying phylogeny. In this manuscript, we report on our progressive gene order alignment approach, give a comparison to previous software, and present an analysis of 17 vertebrate genomes for conservation in gene order.

CYNTENATOR has a runtime complexity of […] and a space complexity of […], with […] being the gene number in a genome. CYNTENATOR performs as well as state-of-the-art software on simulated pairwise gene order comparisons, but is the only algorithm that works in practice for aligning dozens of vertebrate-sized gene orders. Lineage-specific characterization of gene order across 17 vertebrate genomes revealed mechanisms for maintaining conserved synteny, such as enhancers and coregulation by bidirectional promoters. In primate genome comparisons, genes outside conserved synteny blocks are enriched for involvement in responses to external stimuli, such as immunity and olfactory response. We even see significant gene ontology term enrichments for breakpoint regions of ancestral nodes close to the root of the phylogeny. Additionally, our analysis of transposable elements has revealed a significant accumulation of LINE-1 elements in mammalian breakpoint regions.

In summary, CYNTENATOR is a flexible and scalable tool for the identification of conserved gene orders across multiple species over long evolutionary distances.
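To make the idea of a gene order alignment concrete, the following is a minimal sketch of a pairwise local alignment over two gene orders, in which genes rather than nucleotides are the aligned symbols. It is a generic Smith-Waterman-style recurrence and is not CYNTENATOR's scoring function or its phylogeny-aware progressive procedure, neither of which is specified in this section. The function name `align_gene_orders`, the similarity table `sim`, the gene identifiers, and the penalty values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Gene = str


def align_gene_orders(
    a: List[Gene],
    b: List[Gene],
    sim: Dict[Tuple[Gene, Gene], float],
    gap: float = -1.0,
    mismatch: float = -3.0,
) -> Tuple[float, List[Tuple[Gene, Gene]]]:
    """Local alignment of two gene orders (illustrative sketch only).

    `sim` maps a gene pair to a positive similarity score (assumed input);
    pairs missing from `sim` receive the `mismatch` penalty.
    """
    n, m = len(a), len(b)
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = sim.get((a[i - 1], b[j - 1]), mismatch)
            H[i][j] = max(
                0.0,
                H[i - 1][j - 1] + s,  # align the two genes
                H[i - 1][j] + gap,    # skip a gene in a
                H[i][j - 1] + gap,    # skip a gene in b
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell to recover the aligned gene pairs.
    pairs: List[Tuple[Gene, Gene]] = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = sim.get((a[i - 1], b[j - 1]), mismatch)
        if H[i][j] == H[i - 1][j - 1] + s:
            if s > 0:
                pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


# Toy gene orders; identifiers and scores are invented for this example.
order_a = ["a1", "a2", "a3", "a4", "a5"]
order_b = ["b9", "b1", "b2", "bX", "b3", "b7"]
similarity = {("a1", "b1"): 2.0, ("a2", "b2"): 2.0, ("a3", "b3"): 2.0}

score, block = align_gene_orders(order_a, order_b, similarity)
print(score, block)
```

In this toy case the aligned block spans three gene pairs despite an extra gene (`bX`) inserted in the second order, which is the kind of imperfect collinearity a gene order alignment is meant to tolerate. The table-filling step visits every pair of genes, so this sketch uses time and memory proportional to the product of the two gene counts; that statement applies only to the sketch and not to the complexity reported for CYNTENATOR.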

Highlights

  • Whole genome evolution operates on different levels of detail: from single nucleotides to functional elements to whole chromosomes [1]

  • We evaluated the performance of CYNTENATOR, MCMuSec and OrthoCluster on the 20 simulated genome pairs

  • We evaluated the effect of parameter choice on ortholog recovery by computing human-zebrafish gene order alignments using various penalty combinations and otherwise default parameters


Introduction

Whole genome evolution operates on different levels of detail: from single nucleotides to functional elements (e.g. genes) to whole chromosomes [1]. An interesting phenomenon in the evolution of whole genomes is the existence of conserved synteny, which is the maintenance of gene content and order in certain chromosomal regions of two or more related species. Ever since Nadeau and Taylor [2] published their groundbreaking paper on the distribution of synteny breakpoints in the human and mouse genomes, it was commonly believed that breakpoints are essentially distributed at random. Several invertebrate genomes contain operons (e.g. nematodes [3] and ascidians [4]), where gene order is functionally constrained by the necessity to generate a polycistronic messenger RNA. Pevzner and Tesler [5]
